Parallel in-place sort for numpy arrays

Question

I often need to sort large numpy arrays (a few billion elements), and this has become a bottleneck in my code. I am looking for a way to parallelize it.

Are there any parallel implementations of the ndarray.sort() function? The Numexpr module provides parallel implementations of most math operations on numpy arrays, but it lacks sorting capabilities.

Maybe it is possible to write a simple wrapper around a C++ implementation of parallel sorting and use it through Cython?

Recommended answer

I ended up wrapping the GCC parallel sort (__gnu_parallel::sort from the libstdc++ parallel mode). Here is the code:

parallelSort.pyx

# cython: wraparound = False
# cython: boundscheck = False
import numpy as np
cimport numpy as np
import cython
cimport cython 

ctypedef fused real:
    cython.char
    cython.uchar
    cython.short
    cython.ushort
    cython.int
    cython.uint
    cython.long
    cython.ulong
    cython.longlong
    cython.ulonglong
    cython.float
    cython.double

cdef extern from "<parallel/algorithm>" namespace "__gnu_parallel":
    cdef void sort[T](T first, T last) nogil 

def numpyParallelSort(real[::1] a):
    "In-place parallel sort for contiguous 1-D numpy arrays"
    # [::1] requires contiguous memory; a strided view would be sorted incorrectly
    sort(&a[0], &a[a.shape[0]])

Extra compiler arguments: -fopenmp (compile) and -lgomp (link).

This makefile will do it:

all:
    cython --cplus parallelSort.pyx  
    g++  -g -march=native -Ofast -fpic -c    parallelSort.cpp -o parallelSort.o -fopenmp `python-config --includes`
    g++  -g -march=native -Ofast -shared  -o parallelSort.so parallelSort.o `python-config --libs` -lgomp 

clean:
    rm -f parallelSort.cpp *.o *.so

This shows that it works:

from parallelSort import numpyParallelSort
import numpy as np 
a = np.random.random(100000000)

numpyParallelSort(a) 
print(a[:10])
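
Printing the first ten elements only shows that the start of the array looks sorted. A slightly stronger check compares the in-place result with numpy's own sort. The helper below is a hypothetical sketch (check_inplace_sort is not part of the original answer). It takes the sort function as an argument, so it can be tried with ndarray.sort as a stand-in before the extension is built.

```python
import numpy as np


def check_inplace_sort(sort_func, n=1_000_000, dtype=np.float64, seed=0):
    """Return True if sort_func sorts a random 1-D array in place correctly."""
    rng = np.random.default_rng(seed)
    # Scale before casting so integer dtypes get varied values too.
    a = (rng.random(n) * 1000).astype(dtype)
    expected = np.sort(a)  # reference result from numpy's serial sort (a copy)
    sort_func(a)           # must modify a in place
    return np.array_equal(a, expected)


if __name__ == "__main__":
    # Stand-in so the sketch runs without the compiled module;
    # swap in numpyParallelSort once parallelSort is built.
    print(check_inplace_sort(lambda x: x.sort()))
```

Once the module is compiled, `check_inplace_sort(numpyParallelSort)` would run the same comparison against the wrapper, and passing a different `dtype` exercises the other fused-type specialisations.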

Edit: fixed the bug noticed in the comment below.
