Sorting numpy structured and record arrays is very slow

Question

It looks like sorting numpy structured and record arrays by a single column is much slower than sorting a comparable standalone array:

In [111]: a = np.random.rand(10000)

In [112]: b = np.random.rand(10000)

In [113]: rec = np.rec.fromarrays([a,b])

In [114]: timeit rec.argsort(order='f0')
100 loops, best of 3: 18.8 ms per loop

In [115]: timeit a.argsort()
1000 loops, best of 3: 891 µs per loop

There is a marginal improvement using the structured array, but it's not dramatic:

In [120]: struct = np.empty(len(a),dtype=[('a','f8'),('b','f8')])

In [121]: struct['a'] = a

In [122]: struct['b'] = b

In [124]: timeit struct.argsort(order='a')
100 loops, best of 3: 15.8 ms per loop

This indicates that it's potentially faster to create an index array from argsort and then use that to reorder the individual arrays. This is OK except that I expect to be dealing with very large arrays and would like to avoid copying data as much as possible. Is there a more efficient way of doing this that I'm missing?
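For reference, the index-array approach described above might look like the following minimal sketch. The array size and the seeded generator are assumptions chosen for illustration; note that fancy indexing returns new arrays, which is exactly the copying the question hopes to avoid.

```python
import numpy as np

# Illustrative data only; the size and seed are arbitrary assumptions.
rng = np.random.default_rng(0)
a = rng.random(10_000)
b = rng.random(10_000)

# Sort on the plain float column, which avoids the slower
# structured-dtype comparison used by argsort(order=...).
inds = a.argsort()

# Fancy indexing allocates a new array for each reordered column.
a_sorted = a[inds]
b_sorted = b[inds]
```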

Recommended answer

As Jaime said, you can use argsort on the key field to sort the record array:

inds = np.argsort(rec['f0'])

And use take with the out argument to avoid a copy:

np.take(rec, inds, out=rec)
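Put together, the two steps can be sketched end to end as below. The structured array, its size, and the `original` copy are assumptions added here only so the result can be checked; they are not part of the answer. One caveat worth verifying for your NumPy version: the `np.take` documentation notes that `out` is buffered when `mode='raise'` (the default), so a temporary may still be allocated internally even though no new array is returned.

```python
import numpy as np

# Illustrative structured array; field names mirror np.rec.fromarrays defaults.
rng = np.random.default_rng(0)
n = 10_000
rec = np.empty(n, dtype=[('f0', 'f8'), ('f1', 'f8')])
rec['f0'] = rng.random(n)
rec['f1'] = rng.random(n)

# Kept only to verify the result; omit this copy in real use.
original = rec.copy()

# Argsort the plain float field instead of using order='f0'.
inds = np.argsort(rec['f0'])

# Reorder whole records and write the result back into rec.
np.take(rec, inds, out=rec)

# Rows stay paired: f1 followed f0 into the new order.
assert np.all(np.diff(rec['f0']) >= 0)
assert np.array_equal(rec['f1'], original['f1'][inds])
```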
