Is flatMap guaranteed to be lazy?
Question
Consider the following code:
urls.stream()
    .flatMap(url -> fetchDataFromInternet(url).stream())
    .filter(...)
    .findFirst()
    .get();
Will fetchDataFromInternet be called for the second URL if the first one was enough?
I tried a smaller example and it appears to work as expected, i.e. it processes the data one by one. Can this behavior be relied on? If not, does calling .sequential() before .flatMap(...) help?
Stream.of("one", "two", "three")
    .flatMap(num -> {
        System.out.println("Processing " + num);
        // return FetchFromInternetForNum(num).data().stream();
        return Stream.of(num);
    })
    .peek(num -> System.out.println("Peek before filter: " + num))
    .filter(num -> num.length() > 0)
    .peek(num -> System.out.println("Peek after filter: " + num))
    .forEach(num -> {
        System.out.println("Done " + num);
    });
Output:
Processing one
Peek before filter: one
Peek after filter: one
Done one
Processing two
Peek before filter: two
Peek after filter: two
Done two
Processing three
Peek before filter: three
Peek after filter: three
Done three
Update: I am using the official Oracle JDK 8, in case the implementation matters.
Answer: Based on the comments and the answers below, flatMap is partially lazy, i.e. it reads the first inner stream fully and moves on to the next one only when required. Reading a single inner stream is eager, but reading multiple inner streams is lazy.
If this behavior is intended, the API should let the function return an Iterable instead of a Stream.
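The "lazy across inner streams" half of that summary can be checked with a small probe. The following is only a sketch: the class and method names are made up for illustration, and a plain in-memory stream stands in for fetchDataFromInternet. It records which inputs the flatMap function was actually invoked for when the pipeline ends in findFirst().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical probe class; not part of the original question.
final class OuterLazinessProbe {

    // Returns the inputs for which the flatMap function was invoked.
    static List<String> mappedInputs() {
        List<String> seen = new ArrayList<>();
        Stream.of("one", "two", "three")
              .flatMap(num -> {
                  seen.add(num);          // stands in for the expensive fetch
                  return Stream.of(num);
              })
              .filter(num -> num.length() > 0)
              .findFirst()
              .get();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(mappedInputs()); // prints [one]
    }
}
```

For a sequential stream, the mapping function runs only for the first element, because the pipeline checks for cancellation between outer elements. This says nothing about parallel streams, where several outer elements may be processed concurrently.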
Recommended answer
Under the current (JDK 8) implementation, flatMap is eager: it behaves much like a stateful intermediate operation such as sorted or distinct. This is very easy to prove:
int result = Stream.of(1)
    .flatMap(x -> Stream.generate(() -> ThreadLocalRandom.current().nextInt()))
    .findFirst()
    .get();
System.out.println(result);
This never finishes, because flatMap is computed eagerly. For your example:
urls.stream()
    .flatMap(url -> fetchDataFromInternet(url).stream())
    .filter(...)
    .findFirst()
    .get();
It means that for each url, flatMap will block all other operations that come after it, even if you care about only a single element. So suppose that for a single url your fetchDataFromInternet(url) generates 10_000 lines: your findFirst will have to wait for all 10_000 to be computed, even though you care about only one.
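To see how many inner elements are really consumed on a given JDK, a counter can be attached to the inner stream. This is a rough sketch with invented names: the range of 10_000 integers is a stand-in for the lines returned by fetchDataFromInternet(url), and "url" is a placeholder string.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical probe class; not part of the original answer.
final class InnerPullProbe {

    // Returns how many inner elements were pulled before findFirst() returned.
    static int pulledBeforeFirst() {
        AtomicInteger pulled = new AtomicInteger();
        Stream.of("url") // placeholder for a real URL
              .flatMap(url -> IntStream.range(0, 10_000)   // stand-in for fetched lines
                                       .boxed()
                                       .peek(line -> pulled.incrementAndGet()))
              .findFirst()
              .get();
        return pulled.get();
    }

    public static void main(String[] args) {
        System.out.println("Inner elements pulled: " + pulledBeforeFirst());
    }
}
```

On JDK 8 the counter is expected to reach 10_000, matching the description above. On newer JDKs it should be far smaller, so the printed number depends on the runtime in use.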
Edit
This is fixed in Java 10, where we get our laziness back: see JDK-8075939.
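For code that must still run on JDK 8 or 9, one possible workaround is to flatten through an Iterator, so that inner elements are pulled one at a time. The helper below is a sketch, not a JDK API: lazyFlatMap and the class name are made up, it supports sequential use only, and unlike flatMap it does not close the inner streams.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper; an illustration rather than a drop-in replacement for flatMap.
final class LazyFlatMap {

    static <T, R> Stream<R> lazyFlatMap(Stream<T> outer,
                                        Function<? super T, ? extends Stream<? extends R>> mapper) {
        Iterator<T> outerIt = outer.iterator();
        Iterator<R> flat = new Iterator<R>() {
            private Iterator<? extends R> inner = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Advance to the next non-empty inner stream only when needed.
                while (!inner.hasNext()) {
                    if (!outerIt.hasNext()) {
                        return false;
                    }
                    Stream<? extends R> s = mapper.apply(outerIt.next());
                    if (s == null) {
                        inner = Collections.emptyIterator(); // treat null as empty, like flatMap
                    } else {
                        inner = s.iterator();
                    }
                }
                return true;
            }

            @Override
            public R next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return inner.next();
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(flat, Spliterator.ORDERED), false);
    }

    // Counts the inner elements pulled before findFirst() returns.
    static int pulledBeforeFirst() {
        AtomicInteger pulled = new AtomicInteger();
        lazyFlatMap(Stream.of(1, 2),
                    x -> IntStream.range(0, 10_000).boxed().peek(i -> pulled.incrementAndGet()))
                .findFirst()
                .get();
        return pulled.get();
    }

    public static void main(String[] args) {
        System.out.println("Inner elements pulled: " + pulledBeforeFirst()); // prints 1
    }
}
```

Because each inner stream is consumed through its iterator, findFirst() stops after the first element, and an infinite inner stream such as the Stream.generate example no longer hangs. The trade-off is the loss of parallel splitting and of automatic closing of inner streams.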