Sorting a List in parallel without creating a temporary array in Java 8

Question

Java 8 provides java.util.Arrays.parallelSort, which sorts arrays in parallel using the fork-join framework. But there's no corresponding Collections.parallelSort for sorting lists.

I can use toArray, sort that array, and store the result back in my list, but that temporarily increases memory usage, which is already high when I'm sorting in parallel, because parallel sorting only pays off for huge lists. Instead of twice the memory (the list plus parallelSort's working memory), I'm using three times as much (the list, the temporary array and parallelSort's working memory). (The Arrays.parallelSort documentation says "The algorithm requires a working space no greater than the size of the original array".)
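For reference, the copy-out, sort, copy-back workaround described above might look roughly like the sketch below. The class name CopyingParallelSort and its parallelSort helper are made up for illustration; they are not part of the JDK. Only List.toArray, Arrays.parallelSort and ListIterator.set are real library calls.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.ListIterator;

/** Hypothetical helper, not a JDK class. */
final class CopyingParallelSort {
    private CopyingParallelSort() {}

    /** Sorts the list by way of a temporary array (the memory cost discussed above). */
    @SuppressWarnings("unchecked")
    static <T> void parallelSort(List<T> list, Comparator<? super T> cmp) {
        // Temporary array holding a copy of every element reference.
        T[] tmp = (T[]) list.toArray();
        // Parallel sort; may itself need working space up to tmp.length.
        Arrays.parallelSort(tmp, cmp);
        // Write the sorted elements back in list order.
        ListIterator<T> it = list.listIterator();
        for (T e : tmp) {
            it.next();
            it.set(e);
        }
    }
}
```

While the sort is running, the list's own storage, tmp and parallelSort's working space are all live at once, which is the three-fold memory use the question is trying to avoid.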

Memory usage aside, Collections.parallelSort would also be more convenient for what seems like a reasonably common operation. (I tend not to use arrays directly, so I'd certainly use it more often than Arrays.parallelSort.)

The library can test for RandomAccess to avoid trying to e.g. quicksort a linked list, so that can't be a reason for a deliberate omission.

How can I sort a List in parallel without creating a temporary array?

Recommended answer

There doesn't appear to be any straightforward way to sort a List in parallel in Java 8. I don't think this is fundamentally difficult; it looks more like an oversight to me.

The difficulty with a hypothetical Collections.parallelSort(list, cmp) is that the Collections implementation knows nothing about the list's implementation or its internal organization. This can be seen by examining the Java 7 implementation of Collections.sort(list, cmp). As you observed, it has to copy the list elements out to an array, sort them, and then copy them back into the list.

This is the big advantage of the List.sort(cmp) extension method over Collections.sort(list, cmp). It might seem that this is merely a small syntactic advantage: being able to write myList.sort(cmp) instead of Collections.sort(myList, cmp). The real difference is that myList.sort(cmp), being an interface extension method, can be overridden by the specific List implementation. For example, ArrayList.sort(cmp) sorts the list in place using Arrays.sort(), whereas the default implementation uses the old copyout-sort-copyback technique.

It should be possible to add a parallelSort extension method to the List interface that has similar semantics to List.sort but does the sorting in parallel. This would allow ArrayList to do a straightforward in-place sort using Arrays.parallelSort. (It's not entirely clear to me what the default implementation should do. It might still be worth it to do copyout-parallelSort-copyback.) Since this would be an API change, it can't happen until the next major release of Java SE.
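To make the idea concrete, here is a rough sketch of what such an extension method could look like if it lived on a separate interface. Everything named here (ParallelSortableList, DemoList) is hypothetical; the JDK has no such interface, and the copyout-parallelSort-copyback default is only one guess at what a fallback might do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.ListIterator;

/** Hypothetical interface; nothing like this exists in the JDK. */
interface ParallelSortableList<E> extends List<E> {

    /** Fallback for lists that cannot sort in place: copy out, sort, copy back. */
    @SuppressWarnings("unchecked")
    default void parallelSort(Comparator<? super E> cmp) {
        E[] tmp = (E[]) toArray();
        Arrays.parallelSort(tmp, cmp);
        ListIterator<E> it = listIterator();
        for (E e : tmp) {
            it.next();
            it.set(e);
        }
    }
}

/** Hypothetical list that simply inherits the default implementation. */
class DemoList<E> extends ArrayList<E> implements ParallelSortableList<E> {
}
```

An array-backed implementation would override parallelSort to call Arrays.parallelSort directly on its own storage and skip the temporary array, in the same way that ArrayList overrides List.sort.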

As for a Java 8 solution, there are a couple of workarounds, none very pretty (as is typical of workarounds). You could create your own array-based List implementation and override sort() to sort in parallel. Or you could subclass ArrayList, override sort(), grab the elementData array via reflection and call parallelSort() on it. Of course you could just write your own List implementation and provide a parallelSort() method, but the advantage of overriding List.sort() is that it works through the plain List interface, so you don't have to modify all the code in your code base to use a different List subclass.
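The first workaround could be sketched along the following lines. ParallelSortList is a made-up name and this is a deliberately minimal list, not a drop-in replacement for ArrayList: it assumes a simple doubling growth policy and leaves out bulk operations, serialization and capacity tuning. The only part that matters is the sort() override, which hands the backing array straight to Arrays.parallelSort.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.RandomAccess;

/** Hypothetical minimal array-backed list whose sort() runs in parallel. */
class ParallelSortList<E> extends AbstractList<E> implements RandomAccess {
    private Object[] elements = new Object[16];
    private int size;

    @Override public int size() { return size; }

    @SuppressWarnings("unchecked")
    @Override public E get(int index) {
        requireValidIndex(index);
        return (E) elements[index];
    }

    @Override public E set(int index, E value) {
        E old = get(index);
        elements[index] = value;
        return old;
    }

    @Override public void add(int index, E value) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        // Assumed growth policy: double the backing array when it is full.
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        System.arraycopy(elements, index, elements, index + 1, size - index);
        elements[index] = value;
        size++;
        modCount++;
    }

    @Override public E remove(int index) {
        E old = get(index);
        System.arraycopy(elements, index + 1, elements, index, size - index - 1);
        elements[--size] = null;
        modCount++;
        return old;
    }

    /** Sorts the backing array in place; no temporary copy of the list is made. */
    @SuppressWarnings("unchecked")
    @Override public void sort(Comparator<? super E> cmp) {
        Arrays.parallelSort((E[]) elements, 0, size, cmp);
        modCount++;
    }

    private void requireValidIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
    }
}
```

Code that only sees a List still gets the parallel sort, whether it calls list.sort(cmp) or Collections.sort(list, cmp), since the latter delegates to List.sort in Java 8. Arrays.parallelSort may still allocate its own working space, so this removes the temporary array but not the sort's internal buffer.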
