番石榴的Streams :: findLast实现 [英] guava's Streams::findLast implementation
问题描述
我正在研究番石榴中Streams::findLast
的实现,并试图理解它时,有几件事我根本无法掌握.这是它的实现:
I am looking into the implementation of Streams::findLast
from guava and while trying to understand it, there were a couple of things that simply I could not grasp. Here is it's implementation:
public static <T> java.util.Optional<T> findLast(Stream<T> stream) {
class OptionalState {
boolean set = false;
T value = null;
void set(@Nullable T value) {
set = true;
this.value = value;
}
T get() {
checkState(set);
return value;
}
}
OptionalState state = new OptionalState();
Deque<Spliterator<T>> splits = new ArrayDeque<>();
splits.addLast(stream.spliterator());
while (!splits.isEmpty()) {
Spliterator<T> spliterator = splits.removeLast();
if (spliterator.getExactSizeIfKnown() == 0) {
continue; // drop this split
}
// Many spliterators will have trySplits that are SUBSIZED even if they are not themselves
// SUBSIZED.
if (spliterator.hasCharacteristics(Spliterator.SUBSIZED)) {
// we can drill down to exactly the smallest nonempty spliterator
while (true) {
Spliterator<T> prefix = spliterator.trySplit();
if (prefix == null || prefix.getExactSizeIfKnown() == 0) {
break;
} else if (spliterator.getExactSizeIfKnown() == 0) {
spliterator = prefix;
break;
}
}
// spliterator is known to be nonempty now
spliterator.forEachRemaining(state::set);
return java.util.Optional.of(state.get());
}
Spliterator<T> prefix = spliterator.trySplit();
if (prefix == null || prefix.getExactSizeIfKnown() == 0) {
// we can't split this any further
spliterator.forEachRemaining(state::set);
if (state.set) {
return java.util.Optional.of(state.get());
}
// fall back to the last split
continue;
}
splits.addLast(prefix);
splits.addLast(spliterator);
}
return java.util.Optional.empty();
}
从本质上讲,实现并不是那么复杂,但是我发现这有些奇怪(如果这个问题被归结为基于观点的",我会在这里指责)可能会发生.)
In essence the implementation is not that complicated to be honest, but here are the things that I find a bit weird (and I'll take the blame here if this question gets closed as "opinion-based", I understand it might happen).
首先是创建OptionalState
类,该类可能已替换为单个元素的数组:
First of all is the creation of OptionalState
class, this could have been replaced with an array of a single element:
T[] state = (T[]) new Object[1];
,用法很简单:
spliterator.forEachRemaining(x -> state[0] = x);
然后将整个方法分为三部分:
Then the entire method could be split into 3 pieces:
1)当已知某个分隔符为空时:
1) when a certain Spliterator is known to be empty:
if (spliterator.getExactSizeIfKnown() == 0)
在这种情况下,很简单-放下它.
In this case it's easy - just drop it.
2),然后如果已知分隔符为SUBSIZED
.这是幸福之路"的情景;因为在这种情况下,我们可以将其拆分,直到到达最后一个元素.基本上,实现说:拆分直到prefix
为null
或为空(在这种情况下,使用正确的"分隔符),或者如果在拆分之后已知正确的"分隔符为空,则消耗
2) then if the Spliterator is known to be SUBSIZED
. This is the "happy-path" scenario; as in this case we can split this until we get to the last element. Basically the implementation says: split until the prefix
is either null
or it's empty (in which case consume the "right" spliterator) or if after a split the "right" spliterator is known to be empty, consume the prefix
one. This is done via:
// spliterator is known to be nonempty now
spliterator.forEachRemaining(state::set);
return java.util.Optional.of(state.get());
我实际上有关于此评论的
第二个问题:
Second question I have is actually about this comment:
// Many spliterators will have trySplits that are SUBSIZED
// even if they are not themselves SUBSIZED.
这很有趣,但是我找不到这样的例子,如果有人向我介绍一个例子,我将不胜感激.实际上,由于存在此注释,因此下一个代码(方法的第3部分不能像第二个一样使用while(true)
完成),因为它假定在trySplit
之后我们可以获得一个Spliterator
就是SUBSIZED
,即使我们最初的Spliterator
不是,所以它必须转到findLast
的开头.
This is very interesting, but I could not find such an example, would appreciate if someone would introduce me to one. As a matter of fact, because this comment exists, the code in the next (3-rd part of the method can not be done with a while(true)
like the second), because it assumes that after a trySplit
we could obtain a Spliterator
that is SUBSIZED
, even if our initial one was not, so it has to go to the very beginning of findLast
.
3)方法的这一部分是在已知分隔符 not 为SUBSIZED
的情况下,并且在这种情况下,分隔符没有已知的大小;因此,它依赖于如何实现源代码的分隔符,在这种情况下,实际上findLast
几乎没有什么意义.例如,来自HashSet
的Spliterator
将返回最后一个存储桶中的最后一个条目. ..
3) this part of the method is when a Spliterator is known not to be SUBSIZED
and in this case it does not have a known size; thus it relies on how the Spliterator from the source is implemented and in this case actually a findLast
makes little sense... for example a Spliterator
from a HashSet
will return whatever the last entry is in the last bucket...
推荐答案
-
当迭代未知大小的
Spliterator
时,必须跟踪是否遇到了元素.这可以通过调用tryAdvance
并使用返回值来完成,或者通过将forEachRemaining
与Consumer
一起使用来记录是否遇到了元素.当您走后一条路线时,专用类比数组简单.而且,一旦有了专用的类,为什么不也将它用于SIZED
分隔器.
When you iterate a
Spliterator
of an unknown size, you have to track whether an element has been encountered. This can be done by callingtryAdvance
and using the return value or by usingforEachRemaining
with aConsumer
which records whether an element has been encountered. When you go the latter route, a dedicated class is simpler than an array. And once you have a dedicated class, why not use it for theSIZED
spliterator as well.
对我来说奇怪的是,这个仅存在于Consumer
的本地类并没有实现Consumer
,而是需要通过state::set
进行绑定.
What’s strange to me, is that this local class, which only exists to be used as a Consumer
, doesn’t implement Consumer
but requires the binding via state::set
.
考虑
Stream.concat(
Stream.of("foo").filter(s -> !s.isEmpty()),
Stream.of("bar", "baz"))
代表整个流的Spliterator
不能具有SIZED
特性.但是,当拆分大小未知的第一个子流时,其余的流将具有已知的大小.
The Spliterator
representing the entire stream can’t have the SIZED
characteristic. But when splitting off the first substream with the unknown size, the remaining stream has a known size.
测试代码:
Spliterator<String> sp = Stream.concat(
Stream.of("foo").filter(s -> !s.isEmpty()),
Stream.of("bar", "baz"))
.spliterator();
do {
System.out.println(
"SIZED: "+sp.hasCharacteristics(Spliterator.SIZED)
+ ", SUBSIZED: "+sp.hasCharacteristics(Spliterator.SUBSIZED)
+ ", exact size if known: "+sp.getExactSizeIfKnown());
} while(sp.trySplit() != null);
结果:
SIZED: false, SUBSIZED: false, exact size if known: -1
SIZED: true, SUBSIZED: true, exact size if known: 2
SIZED: true, SUBSIZED: true, exact size if known: 1
但是对我来说,当有人在评论中告诉您知道拆分可以更改特征然后使用SUBSIZED
进行预测试时,看起来很奇怪,而不是仅仅进行拆分并检查结果是否已知尺寸.毕竟,当特征不存在时,代码仍然会在备用分支中进行拆分.在我的旧答案中,我做了预测试以避免分配数据结构,但是在这里,总是创建ArrayDeque
并用过的.但是我认为,即使是我以前的答案也可以简化.
But to me, it looks weird when someone tells in a comment to know that splitting can change the characteristics and then doing a pre-test with SUBSIZED
, instead of just doing the split and check whether the result has a known size. After all, the code is doing the split anyway, in the alternative branch, when the characteristic is not present. In my old answer, I did the pretest to avoid allocating data structures, but here, the ArrayDeque
is always created and used. But I think, even my old answer could be simplified.
我不确定您的目标是什么.当Spliterator
具有ORDERED
特性时,遍历和拆分的顺序是明确定义的.由于HashSet
没有排序,因此术语最后"是没有意义的.如果您是激进的,则可以优化操作以仅返回无序流的第一个元素;这是有效的,而且速度更快.
I’m not sure what you are aiming at. When a Spliterator
has the ORDERED
characteristic, the order of traversal and splitting is well-defined. Since HashSet
is not ordered, the term "last" is meaningless. If you are radical, you could optimize the operation to just return the first element for unordered streams; that’s valid and much faster.
奇怪的是,这种情况是
if (prefix == null || prefix.getExactSizeIfKnown() == 0) {
// we can't split this any further
(以及SUBSIZED
路径中的类似循环终止)
(and a similar loop termination in the SUBSIZED
path)
仅仅因为一个前缀的大小已知为零,并不意味着后缀不能进一步分开.规范中什么也没说.
Just because one prefix happened to have a known zero size, it does not imply that the suffix can’t split further. Nothing in the specification says that.
由于这种情况,Stream.concat(Stream.of("foo"), Stream.of("bar","baz"))
可以得到最佳处理,而对于Stream.concat(Stream.of(), Stream.of("bar", "baz"))
,它将回落到遍历中,因为第一个前缀的已知大小为零.
As a consequence of this condition, Stream.concat(Stream.of("foo"), Stream.of("bar","baz"))
can be handled optimally, whereas for Stream.concat(Stream.of(), Stream.of("bar", "baz"))
, it will fall back to a traversal, because the first prefix has a known size of zero.
这篇关于番石榴的Streams :: findLast实现的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!