Is it worth using distinct() with collect(toSet())?


Question

When collecting the elements of a stream into a set, is there any advantage (or drawback) to also specifying .distinct() on the stream? For example:

return items.stream().map(...).distinct().collect(toSet());

Given that the set will already remove duplicates, this seems redundant, but does it offer any performance advantage or disadvantage? Does the answer depend on whether the stream is parallel or sequential, or ordered or unordered?

Recommended answer

According to the javadoc, distinct is a stateful intermediate operation.

If you literally have .distinct followed immediately by .collect(toSet()), it doesn't really add any benefit. If the .distinct implementation happened to be faster than the Set's own duplicate check you might gain a little, but the set removes duplicates regardless, so you end up with the same result either way.
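
As a rough illustration of the point above, the sketch below builds the same set with and without .distinct(). The class name, the sample list, and the String::length mapping are assumptions made for the example; the original question elides the mapping function.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class DistinctBeforeToSet {

    // Variant from the question: distinct() directly before collect(toSet()).
    static Set<Integer> withDistinct(List<String> items) {
        return items.stream()
                .map(String::length)   // assumed mapping, for illustration only
                .distinct()
                .collect(Collectors.toSet());
    }

    // Same pipeline without distinct(); the Set drops duplicates itself.
    static Set<Integer> withoutDistinct(List<String> items) {
        return items.stream()
                .map(String::length)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "bb", "cc", "a", "ddd");
        // Both variants yield a set containing 1, 2 and 3.
        System.out.println(withDistinct(items).equals(withoutDistinct(items))); // prints true
    }
}
```

The two methods return equal sets; the only difference is that the first one tracks already-seen elements twice, once inside distinct() and once inside the collecting Set.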

If, on the other hand, .distinct occurs before your .map operation, and that particular mapping is expensive, you may get some gains because the mapping runs only once per distinct input element instead of once per element.
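
The effect of moving .distinct() ahead of the mapping can be sketched by counting how often the mapping function runs. Everything named here (the class, the expensive method, the call counter, and the sample data) is hypothetical and only stands in for a genuinely costly mapping.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class DistinctBeforeMap {

    // Counts how many times the mapping function is invoked.
    static final AtomicInteger calls = new AtomicInteger();

    // Hypothetical stand-in for an expensive mapping.
    static String expensive(String s) {
        calls.incrementAndGet();
        return s.toUpperCase();
    }

    // Maps every element, duplicates included, then lets the Set deduplicate.
    static Set<String> mapThenCollect(List<String> items) {
        return items.stream()
                .map(DistinctBeforeMap::expensive)
                .collect(Collectors.toSet());
    }

    // Deduplicates the inputs first, so each distinct input is mapped once.
    static Set<String> distinctThenMap(List<String> items) {
        return items.stream()
                .distinct()
                .map(DistinctBeforeMap::expensive)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "a", "a", "b");

        calls.set(0);
        mapThenCollect(items);
        System.out.println("map only: " + calls.get() + " calls");        // 5 calls

        calls.set(0);
        distinctThenMap(items);
        System.out.println("distinct first: " + calls.get() + " calls");  // 2 calls
    }
}
```

Whether this pays off depends on how many duplicates the input contains and how costly the mapping is compared with the bookkeeping that distinct() does.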
