如何比快速排序更快地对整数数组进行排序? [英] How to sort an array of integers faster than quicksort?

查看:141
本文介绍了如何比快速排序更快地对整数数组进行排序?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

使用numpy的quicksort排序整数数组已成为 我算法的瓶颈.不幸的是,numpy没有 基数排序. 尽管计数排序在numpy中是单线的:

Sorting an array of integers with numpy's quicksort has become the bottleneck of my algorithm. Unfortunately, numpy does not have radix sort yet. Although counting sort would be a one-liner in numpy:

np.repeat(np.arange(1+x.max()), np.bincount(x))

请参见已接受的答案,如何对这种python计数排序向量进行矢量化处理,以使其速度尽可能的快?0运行到2**32.

see the accepted answer to the How can I vectorize this python count sort so it is absolutely as fast as it can be? question, the integers in my application can run from 0 to 2**32.

我坚持使用quicksort吗?

Am I stuck with quicksort?

该帖子的主要动机是 使用itertools.groupby性能对数字进行分组 问题.
另请注意 不仅可以自己提问和回答问题,明确鼓励.

This post was primarily motivated by the Numpy grouping using itertools.groupby performance question.
Also note that it is not merely OK to ask and answer your own question, it is explicitly encouraged.

推荐答案

不,您不会被quicksort所困扰.例如,您可以使用 integer_sort来自 Boost.Sort u4_sort来自 usort .对数组进行排序时:

No, you are not stuck with quicksort. You could use, for example, integer_sort from Boost.Sort or u4_sort from usort. When sorting this array:

array(randint(0, high=1<<32, size=10**8), uint32)

我得到以下结果:

NumPy quicksort:         8.636 s  1.0  (baseline)
Boost.Sort integer_sort: 4.327 s  2.0x speedup
usort u4_sort:           2.065 s  4.2x speedup

基于这个单一的实验和使用,我不会得出结论. usort盲目地.我会用我的实际数据进行测试并衡量会发生什么. 您的里程 会有所不同,具体取决于您的数据和计算机.这 Boost.Sort中的integer_sort具有丰富的调整选项集,请参见 文档.

I would not jump to conclusions based on this single experiment and use usort blindly. I would test with my actual data and measure what happens. Your mileage will vary depending on your data and on your machine. The integer_sort in Boost.Sort has a rich set of options for tuning, see the documentation.

以下,我描述了两种从Python调用本机C或C ++函数的方法.尽管描述很长,但是这样做还是很容易的.

Below I describe two ways to call a native C or C++ function from Python. Despite the long description, it's fairly easy to do it.

提升排序

将这些行放入spreadsort.cpp文件中:

Put these lines into the spreadsort.cpp file:

#include <cinttypes>
#include "boost/sort/spreadsort/spreadsort.hpp"
using namespace boost::sort::spreadsort;

extern "C" {
    void spreadsort(std::uint32_t* begin,  std::size_t len) {
        integer_sort(begin, begin + len);
    }
}

它基本上实例化了32位模板化的integer_sort 无符号整数; extern "C"部分通过禁用来确保C链接 名字改头换面. 假设您正在使用gcc,并且必要的包含boost文件 在/tmp/boost_1_60_0目录下,您可以对其进行编译:

It basically instantiates the templated integer_sort for 32 bit unsigned integers; the extern "C" part ensures C linkage by disabling name mangling. Assuming you are using gcc and that the necessary include files of boost are under the /tmp/boost_1_60_0 directory, you can compile it:

g++ -O3 -std=c++11 -march=native -DNDEBUG -shared -fPIC -I/tmp/boost_1_60_0 spreadsort.cpp -o spreadsort.so  

要生成的关键标志是-fPIC 位置无关代码-shared生成一个 共享对象 .so文件. (有关详细信息,请阅读gcc文档.)

The key flags are -fPIC to generate position-independet code and -shared to generate a shared object .so file. (Read the docs of gcc for further details.)

然后,包装spreadsort() C ++函数 在Python中使用 ctypes :

Then, you wrap the spreadsort() C++ function in Python using ctypes:

from ctypes import cdll, c_size_t, c_uint32
from numpy import uint32
from numpy.ctypeslib import ndpointer

__all__ = ['integer_sort']

# In spreadsort.cpp: void spreadsort(std::uint32_t* begin,  std::size_t len)
lib = cdll.LoadLibrary('./spreadsort.so')
sort = lib.spreadsort
sort.restype = None
sort.argtypes = [ndpointer(c_uint32, flags='C_CONTIGUOUS'), c_size_t]

def integer_sort(arr):
    assert arr.dtype == uint32, 'Expected uint32, got {}'.format(arr.dtype)
    sort(arr, arr.size)

或者,您可以使用 cffi :

from cffi import FFI
from numpy import uint32

__all__ = ['integer_sort']

ffi = FFI()
ffi.cdef('void spreadsort(uint32_t* begin,  size_t len);')
C = ffi.dlopen('./spreadsort.so')

def integer_sort(arr):
    assert arr.dtype == uint32, 'Expected uint32, got {}'.format(arr.dtype)
    begin = ffi.cast('uint32_t*', arr.ctypes.data)
    C.spreadsort(begin, arr.size)

cdll.LoadLibrary()ffi.dlopen()调用中,我假设 spreadsort.so文件的路径是./spreadsort.so.或者, 你可以写

At the cdll.LoadLibrary() and ffi.dlopen() calls I assumed that the path to the spreadsort.so file is ./spreadsort.so. Alternatively, you can write

lib = cdll.LoadLibrary('spreadsort.so')

C = ffi.dlopen('spreadsort.so')

如果将spreadsort.so的路径附加到LD_LIBRARY_PATH环境 多变的.另请参见共享库.

if you append the path to spreadsort.so to the LD_LIBRARY_PATH environment variable. See also Shared Libraries.

用法.在两种情况下,您只需调用上述Python包装函数integer_sort() 用32位无符号整数的numpy数组.

Usage. In both cases you simply call the above Python wrapper function integer_sort() with your numpy array of 32 bit unsigned integers.

usort

对于u4_sort,您可以如下进行编译:

As for u4_sort, you can compile it as follows:

cc -DBUILDING_u4_sort -I/usr/include -I./ -I../ -I../../ -I../../../ -I../../../../ -std=c99 -fgnu89-inline -O3 -g -fPIC -shared -march=native u4_sort.c -o u4_sort.so

u4_sort.c文件所在的目录中发出此命令. (可能没有那么多hacker的方法,但是我没有弄清楚.我 只是查看了usort目录中的deps.mk文件以查找 必要的编译器标志并包含路径.)

Issue this command in the directory where the u4_sort.c file is located. (Probably there is a less hackish way but I failed to figure that out. I just looked into the deps.mk file in the usort directory to find out the necessary compiler flags and include paths.)

然后,您可以如下包装C函数:

Then, you can wrap the C function as follows:

from cffi import FFI
from numpy import uint32

__all__ = ['integer_sort']

ffi = FFI()
ffi.cdef('void u4_sort(unsigned* a, const long sz);')
C = ffi.dlopen('u4_sort.so')

def integer_sort(arr):
    assert arr.dtype == uint32, 'Expected uint32, got {}'.format(arr.dtype)
    begin = ffi.cast('unsigned*', arr.ctypes.data)
    C.u4_sort(begin, arr.size)

在上面的代码中,我假设到u4_sort.so的路径是 附加到LD_LIBRARY_PATH环境变量.

In the above code, I assumed that the path to u4_sort.so has been appended to the LD_LIBRARY_PATH environment variable.

用法.像以前使用Boost.Sort一样,只需使用32位无符号整数的numpy数组调用上述Python包装函数integer_sort().

Usage. As before with Boost.Sort, you simply call the above Python wrapper function integer_sort() with your numpy array of 32 bit unsigned integers.

这篇关于如何比快速排序更快地对整数数组进行排序?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆