番石榴的Streams :: findLast实现 [英] guava's Streams::findLast implementation

查看:124
本文介绍了番石榴的Streams :: findLast实现的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在研究番石榴中Streams::findLast的实现,并试图理解它时,有几件事我根本无法掌握.这是它的实现:

I am looking into the implementation of Streams::findLast from guava and while trying to understand it, there were a couple of things that simply I could not grasp. Here is it's implementation:

public static <T> java.util.Optional<T> findLast(Stream<T> stream) {
    class OptionalState {

        boolean set = false;
        T value = null;

        void set(@Nullable T value) {
            set = true;
            this.value = value;
        }

        T get() {
            checkState(set);
            return value;
        }
    }
    OptionalState state = new OptionalState();

    Deque<Spliterator<T>> splits = new ArrayDeque<>();
    splits.addLast(stream.spliterator());

    while (!splits.isEmpty()) {
        Spliterator<T> spliterator = splits.removeLast();

        if (spliterator.getExactSizeIfKnown() == 0) {
            continue; // drop this split
        }

        // Many spliterators will have trySplits that are SUBSIZED even if they are not themselves
        // SUBSIZED.
        if (spliterator.hasCharacteristics(Spliterator.SUBSIZED)) {
            // we can drill down to exactly the smallest nonempty spliterator
            while (true) {
                Spliterator<T> prefix = spliterator.trySplit();
                if (prefix == null || prefix.getExactSizeIfKnown() == 0) {
                    break;
                } else if (spliterator.getExactSizeIfKnown() == 0) {
                    spliterator = prefix;
                    break;
                }
            }

            // spliterator is known to be nonempty now
            spliterator.forEachRemaining(state::set);
            return java.util.Optional.of(state.get());
        }

        Spliterator<T> prefix = spliterator.trySplit();
        if (prefix == null || prefix.getExactSizeIfKnown() == 0) {
            // we can't split this any further
            spliterator.forEachRemaining(state::set);
            if (state.set) {
                return java.util.Optional.of(state.get());
            }
            // fall back to the last split
            continue;
        }
        splits.addLast(prefix);
        splits.addLast(spliterator);
    }
    return java.util.Optional.empty();
}

从本质上讲,实现并不是那么复杂,但是我发现这有些奇怪(如果这个问题被归结为基于观点的",我会在这里指责)可能会发生.)

In essence the implementation is not that complicated to be honest, but here are the things that I find a bit weird (and I'll take the blame here if this question gets closed as "opinion-based", I understand it might happen).

首先是创建OptionalState类,该类可能已替换为单个元素的数组:

First of all is the creation of OptionalState class, this could have been replaced with an array of a single element:

T[] state = (T[]) new Object[1];

,用法很简单:

spliterator.forEachRemaining(x -> state[0] = x);

然后将整个方法分为三部分:

Then the entire method could be split into 3 pieces:

1)当已知某个分隔符为空时:

1) when a certain Spliterator is known to be empty:

if (spliterator.getExactSizeIfKnown() == 0) 

在这种情况下,很简单-放下它.

In this case it's easy - just drop it.

2),然后如果已知分隔符为SUBSIZED.这是幸福之路"的情景;因为在这种情况下,我们可以将其拆分,直到到达最后一个元素.基本上,实现说:拆分直到prefixnull或为空(在这种情况下,使用正确的"分隔符),或者如果在拆分之后已知正确的"分隔符为空,则消耗一.这是通过以下方式完成的:

2) then if the Spliterator is known to be SUBSIZED. This is the "happy-path" scenario; as in this case we can split this until we get to the last element. Basically the implementation says: split until the prefix is either null or it's empty (in which case consume the "right" spliterator) or if after a split the "right" spliterator is known to be empty, consume the prefix one. This is done via:

// spliterator is known to be nonempty now
spliterator.forEachRemaining(state::set);
return java.util.Optional.of(state.get());

我实际上有关于此评论的

第二个问题:

Second question I have is actually about this comment:

// Many spliterators will have trySplits that are SUBSIZED 
// even if they are not themselves SUBSIZED.

这很有趣,但是我找不到这样的例子,如果有人向我介绍一个例子,我将不胜感激.实际上,由于存在此注释,因此下一个代码(方法的第3部分不能像第二个一样使用while(true)完成),因为它假定在trySplit之后我们可以获得一个Spliterator就是SUBSIZED,即使我们最初的Spliterator不是,所以它必须转到findLast的开头.

This is very interesting, but I could not find such an example, would appreciate if someone would introduce me to one. As a matter of fact, because this comment exists, the code in the next (3-rd part of the method can not be done with a while(true) like the second), because it assumes that after a trySplit we could obtain a Spliterator that is SUBSIZED, even if our initial one was not, so it has to go to the very beginning of findLast.

3)方法的这一部分是在已知分隔符 not SUBSIZED的情况下,并且在这种情况下,分隔符没有已知的大小;因此,它依赖于如何实现源代码的分隔符,在这种情况下,实际上findLast几乎没有什么意义.例如,来自HashSetSpliterator将返回最后一个存储桶中的最后一个条目. ..

3) this part of the method is when a Spliterator is known not to be SUBSIZED and in this case it does not have a known size; thus it relies on how the Spliterator from the source is implemented and in this case actually a findLast makes little sense... for example a Spliterator from a HashSet will return whatever the last entry is in the last bucket...

推荐答案

  1. 当迭代未知大小的Spliterator时,必须跟踪是否遇到了元素.这可以通过调用tryAdvance并使用返回值来完成,或者通过将forEachRemainingConsumer一起使用来记录是否遇到了元素.当您走后一条路线时,专用类比数组简单.而且,一旦有了专用的类,为什么不也将它用于SIZED分隔器.

  1. When you iterate a Spliterator of an unknown size, you have to track whether an element has been encountered. This can be done by calling tryAdvance and using the return value or by using forEachRemaining with a Consumer which records whether an element has been encountered. When you go the latter route, a dedicated class is simpler than an array. And once you have a dedicated class, why not use it for the SIZED spliterator as well.

对我来说奇怪的是,这个仅存在于Consumer的本地类并没有实现Consumer,而是需要通过state::set进行绑定.

What’s strange to me, is that this local class, which only exists to be used as a Consumer, doesn’t implement Consumer but requires the binding via state::set.

考虑

Stream.concat(
    Stream.of("foo").filter(s -> !s.isEmpty()),
    Stream.of("bar", "baz"))

代表整个流的Spliterator不能具有SIZED特性.但是,当拆分大小未知的第一个子流时,其余的流将具有已知的大小.

The Spliterator representing the entire stream can’t have the SIZED characteristic. But when splitting off the first substream with the unknown size, the remaining stream has a known size.

测试代码:

Spliterator<String> sp = Stream.concat(
    Stream.of("foo").filter(s -> !s.isEmpty()),
    Stream.of("bar", "baz"))
    .spliterator();
do {
    System.out.println(
          "SIZED: "+sp.hasCharacteristics(Spliterator.SIZED)
        + ", SUBSIZED: "+sp.hasCharacteristics(Spliterator.SUBSIZED)
        + ", exact size if known: "+sp.getExactSizeIfKnown());
} while(sp.trySplit() != null);

结果:

SIZED: false, SUBSIZED: false, exact size if known: -1
SIZED: true, SUBSIZED: true, exact size if known: 2
SIZED: true, SUBSIZED: true, exact size if known: 1

但是对我来说,当有人在评论中告诉您知道拆分可以更改特征然后使用SUBSIZED进行预测试时,看起来很奇怪,而不是仅仅进行拆分并检查结果是否已知尺寸.毕竟,当特征不存在时,代码仍然会在备用分支中进行拆分.在我的旧答案中,我做了预测试以避免分配数据结构,但是在这里,总是创建ArrayDeque并用过的.但是我认为,即使是我以前的答案也可以简化.

But to me, it looks weird when someone tells in a comment to know that splitting can change the characteristics and then doing a pre-test with SUBSIZED, instead of just doing the split and check whether the result has a known size. After all, the code is doing the split anyway, in the alternative branch, when the characteristic is not present. In my old answer, I did the pretest to avoid allocating data structures, but here, the ArrayDeque is always created and used. But I think, even my old answer could be simplified.

我不确定您的目标是什么.当Spliterator具有ORDERED特性时,遍历和拆分的顺序是明确定义的.由于HashSet没有排序,因此术语最后"是没有意义的.如果您是激进的,则可以优化操作以仅返回无序流的第一个元素;这是有效的,而且速度更快.

I’m not sure what you are aiming at. When a Spliterator has the ORDERED characteristic, the order of traversal and splitting is well-defined. Since HashSet is not ordered, the term "last" is meaningless. If you are radical, you could optimize the operation to just return the first element for unordered streams; that’s valid and much faster.

奇怪的是,这种情况是

if (prefix == null || prefix.getExactSizeIfKnown() == 0) {
    // we can't split this any further

(以及SUBSIZED路径中的类似循环终止)

(and a similar loop termination in the SUBSIZED path)

仅仅因为一个前缀的大小已知为零,并不意味着后缀不能进一步分开.规范中什么也没说.

Just because one prefix happened to have a known zero size, it does not imply that the suffix can’t split further. Nothing in the specification says that.

由于这种情况,Stream.concat(Stream.of("foo"), Stream.of("bar","baz"))可以得到最佳处理,而对于Stream.concat(Stream.of(), Stream.of("bar", "baz")),它将回落到遍历中,因为第一个前缀的已知大小为零.

As a consequence of this condition, Stream.concat(Stream.of("foo"), Stream.of("bar","baz")) can be handled optimally, whereas for Stream.concat(Stream.of(), Stream.of("bar", "baz")), it will fall back to a traversal, because the first prefix has a known size of zero.

这篇关于番石榴的Streams :: findLast实现的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆