Quickly degrading stream throughput with chained operations?

Problem description

I expected simple intermediate stream operations, such as limit(), to have very little overhead. But the difference in throughput between these examples is actually significant:

final long MAX = 5_000_000_000L;

LongStream.rangeClosed(0, MAX)
          .count();
// throughput: 1.7 bn values/second


LongStream.rangeClosed(0, MAX)
          .limit(MAX)
          .count();
// throughput: 780m values/second

LongStream.rangeClosed(0, MAX)
          .limit(MAX)
          .limit(MAX)
          .count();
// throughput: 130m values/second

LongStream.rangeClosed(0, MAX)
          .limit(MAX)
          .limit(MAX)
          .limit(MAX)
          .count();
// throughput: 65m values/second

I am curious: what is the reason for the quickly degrading throughput? Is it a consistent pattern with chained stream operations, or is it my test setup? (I have not used JMH so far; I just set up a quick experiment with a stopwatch.)
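
For the record, a stopwatch loop is easily distorted by JIT warm-up and dead-code elimination, so a JMH harness is the safer way to reproduce these numbers. Below is a minimal sketch, assuming JMH is on the classpath; the class and method names are mine, not from the question, and MAX is scaled down so individual iterations stay short:

import java.util.concurrent.TimeUnit;
import java.util.stream.LongStream;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
@State(Scope.Benchmark)
public class LimitBenchmark {

    // Smaller than the 5bn used above so each invocation finishes quickly.
    private static final long MAX = 5_000_000L;

    @Benchmark
    public long noLimit() {
        return LongStream.rangeClosed(0, MAX).count();
    }

    @Benchmark
    public long oneLimit() {
        return LongStream.rangeClosed(0, MAX).limit(MAX).count();
    }

    @Benchmark
    public long twoLimits() {
        return LongStream.rangeClosed(0, MAX).limit(MAX).limit(MAX).count();
    }
}

One caveat: on Java 9 and later, count() can sometimes compute the result from the stream's known size without traversing it at all, so results on newer JDKs need extra care to interpret.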

Solution

limit will result in a slice being made of the stream, with a split iterator (for parallel operation). In a word: inefficient. That is a large overhead for what is effectively a no-op here. And that two consecutive limit calls result in two slices is a shame.

You should have a look at the implementation of IntStream.limit.
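
A quick way to observe the extra wrapping from the outside, without reading the JDK source, is to compare the spliterator classes of a plain range and a limited one. A small sketch follows; the printed class names are OpenJDK implementation details and will vary by version:

import java.util.Spliterator;
import java.util.stream.LongStream;

public class SliceInspect {
    public static void main(String[] args) {
        // The plain range exposes its source spliterator directly.
        Spliterator.OfLong plain =
                LongStream.rangeClosed(0, 10).spliterator();
        // After limit(), the pipeline instead hands back a wrapping
        // spliterator produced by the slice operation.
        Spliterator.OfLong sliced =
                LongStream.rangeClosed(0, 10).limit(10).spliterator();
        System.out.println(plain.getClass().getName());
        System.out.println(sliced.getClass().getName());
    }
}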

As Streams are still relatively new, optimization should come last, once production code exists. Calling limit three times seems a bit far-fetched.
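
That said, if chained limits do show up in real code, they can be folded into a single slice by hand, since limit(a).limit(b) is equivalent to limit(Math.min(a, b)). A small sketch with made-up values:

import java.util.stream.LongStream;

public class CombinedLimit {
    public static void main(String[] args) {
        final long MAX = 5_000_000L;
        long first = MAX;   // hypothetical first limit
        long second = MAX;  // hypothetical second limit
        // One slice instead of two: limit(first).limit(second)
        // behaves the same as limit(Math.min(first, second)).
        long count = LongStream.rangeClosed(0, MAX)
                               .limit(Math.min(first, second))
                               .count();
        System.out.println(count);
    }
}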
