什么更有效:排序流或排序列表? [英] What is more efficient: sorted stream or sorting a list?

查看:117
本文介绍了什么更有效:排序流或排序列表?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设我们在集合中有一些项目,我们想要使用某个比较器对它们进行排序,期望在列表中得到结果:

Assume we have some items in a collection and we want to sort them using certain comparator, expecting result in a list:

Collection<Item> items = ...;
Comparator<Item> itemComparator = ...;

其中一种方法是对列表中的项目进行排序,例如:

One of the approaches is to sort items in a list, something like:

List<Item> sortedItems = new ArrayList<>(items);
Collections.sort(sortedItems, itemComparator);

另一种方法是使用排序流:

Anothe approach is using a sorted stream:

List<Item> sortedItems = items
    .stream()
    .sorted(itemComparator)
    .collect(Collectors.toList());

我想知道哪种方法效率更高?排序流是否有任何优点(如多核上的紧固排序)?

I wonder, which approach is more efficient? Are there any advantages of a sorted stream (like faste sorting on multiple cores)?

从运行时复杂度/最快的角度来看,效率很高。

Efficient in a sense of runtime complexity/fastest.

我不相信自己能够实现一个完美的基准测试和学习 SortedOps 并没有真正启发我。

I don't trust myself to implement a perfect benchmark and studying SortedOps did not really enlighten me.

推荐答案

可以肯定地说,两种排序形式具有相同的复杂性......即使不查看代码也是如此。 (如果他们没有,那么一个表格将被严重破坏!)

It is safe to say that two forms of sort will have the same complexity ... even without looking at the code. (If they didn't then one form would be severely broken!)

查看流的Java 8源代码(特别是内部类 java .util.stream.SortedOps ), sorted()方法将一个组件添加到流管道中,该管道将所有流元素捕获到一个数组或 ArrayList

Looking at Java 8 source code for streams (specifically the internal class java.util.stream.SortedOps), the sorted() method adds a component to a stream pipeline that captures all of the stream elements into either an array or an ArrayList.


  • 当且仅使用数组如果管道汇编代码可以提前推断出流中的元素数量。

  • An array is used if and only if the pipeline assembly code can deduce the number of elements in the stream ahead of time.

否则, ArrayList 用于收集要排序的元素。

Otherwise, an ArrayList is used to gather the elements to be sorted.

如果使用 ArrayList ,则会产生额外的构建/增长开销列表。

If an ArrayList is used, you incur the extra overhead of building / growing the list.

然后我们返回两个版本的代码:

Then we return to two versions of the code:

List<Item> sortedItems = new ArrayList<>(items);
Collections.sort(sortedItems, itemComparator);

在此版本中, ArrayList 构造函数副本元素 items 到适当大小的数组,然后 Collections.sort 执行该数组的就地排序。 (这发生在封面下)。

In this version, the ArrayList constructor copies the elements items to an appropriately sized array, and then Collections.sort does an in-place sort of that array. (This happens under the covers).

List<Item> sortedItems = items
    .stream()
    .sorted(itemComparator)
    .collect(Collectors.toList());

在这个版本中,正如我们上面所看到的,与关联的代码已经排序()构建和排序数组(相当于上面发生的情况)或者以慢速方式构建 ArrayList 。但最重要的是,从到收集器的数据流的开销很大。

In this version, as we have seen above, the code associated with sorted() either builds and sorts an array (equivalent to what happens above) or it builds the ArrayList the slow way. But on top of that, there are the overheads of stream the data from items and to the collector.

总体而言(至少使用Java 8实现)代码检查告诉我代码的第一个版本不能慢于第二个版本,并且在大多数(如果不是全部)情况下它会更快。但随着列表变大, O(NlogN)排序将倾向于支配 O(N)的开销。复制。这意味着两个版本之间的相对差异会变小。

Overall (with the Java 8 implementation at least) code examination tells me that first version of the code cannot be slower than the second version, and in most (if not all) cases it will be faster. But as the list gets larger, the O(NlogN) sorting will tend to dominate the O(N) overheads of copying. That will mean that the relative difference between the two versions will get smaller.

如果你真的在乎,你应该能够写一个基准测试Java的特定实现和特定输入数据集的实际差异。 (或者适应@Eugene的基准!)

If you really care, you should be able to write a benchmark to test the actual difference with a specific implementation of Java, and a specific input dataset. (Or adapt @Eugene's benchmark!)

这篇关于什么更有效:排序流或排序列表?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆