Fast numpy fancy indexing

Question

My code for slicing a numpy array (via fancy indexing) is very slow. It is currently a bottleneck in my program.

a.shape
(3218, 6)

ts = time.time(); a[rows][:, cols]; te = time.time(); print('%.8f' % (te-ts));
0.00200009

What is the correct numpy call to get an array consisting of the subset of rows 'rows' and columns 'cols' of the matrix a? (In fact, I need the transpose of this result.)

Recommended answer

You can get some speedup if you slice using fancy indexing and broadcasting:

import numpy as np

def slice_1(a, rs, cs):
    # Two-step indexing: select the rows (this makes a copy), then the columns
    return a[rs][:, cs]

def slice_2(a, rs, cs):
    # One-step indexing: rs[:, None] has shape (n, 1) and broadcasts against cs
    return a[rs[:, None], cs]

>>> rows, cols = 3218, 6
>>> rs = np.unique(np.random.randint(0, rows, size=(rows//2,)))
>>> cs = np.unique(np.random.randint(0, cols, size=(cols//2,)))
>>> a = np.random.rand(rows, cols)
>>> import timeit
>>> print(timeit.timeit('slice_1(a, rs, cs)',
...                     'from __main__ import slice_1, a, rs, cs',
...                     number=1000))
0.24083110865
>>> print(timeit.timeit('slice_2(a, rs, cs)',
...                     'from __main__ import slice_2, a, rs, cs',
...                     number=1000))
0.206566124519
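
To make the broadcasting step concrete, here is a minimal sketch on a small array. The array contents and the index values are made up for illustration. It checks that the chained form, the broadcast form, and numpy's np.ix_ helper select the same elements, and that swapping which index array carries the extra axis yields the transposed result the question asks for.

```python
import numpy as np

# Small illustrative array; values and indices are hypothetical.
a = np.arange(48.0).reshape(8, 6)
rs = np.array([1, 4, 6])   # row subset
cs = np.array([0, 5])      # column subset

# Chained indexing: copies the selected rows, then selects columns.
chained = a[rs][:, cs]

# Broadcast indexing: rs[:, None] has shape (3, 1) and cs has shape (2,),
# so the two index arrays broadcast to shape (3, 2).
broadcast = a[rs[:, None], cs]

# np.ix_ builds an equivalent pair of broadcastable index arrays.
via_ix = a[np.ix_(rs, cs)]

# Transposed result in one step: put the extra axis on cs instead of rs.
transposed = a[rs, cs[:, None]]

assert np.array_equal(chained, broadcast)
assert np.array_equal(chained, via_ix)
assert np.array_equal(chained.T, transposed)
```

Whether the one-step transposed form is faster than calling .T on the result is not measured here; .T only returns a view, so either form may be acceptable.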

If you think in terms of percentages, doing something 15% faster is always good, but on my system, for an array of your size, this saves only about 40 µs per slicing call, and it is hard to believe that an operation taking 240 µs will be your bottleneck.
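
The per-call figures come from dividing the total timeit result by the number of runs (for example, 0.2408 s over 1000 runs is about 240 µs per call). The sketch below does that conversion for both forms. The helper name per_call_us is hypothetical, and the numbers it prints depend on your machine, so none are shown here.

```python
import timeit
import numpy as np

rows, cols = 3218, 6
a = np.random.rand(rows, cols)
rs = np.unique(np.random.randint(0, rows, size=rows // 2))
cs = np.unique(np.random.randint(0, cols, size=cols // 2))

def per_call_us(stmt, number=1000, repeat=5):
    """Best-of-`repeat` time per call, in microseconds (hypothetical helper)."""
    totals = timeit.repeat(stmt, globals=globals(), number=number, repeat=repeat)
    return min(totals) / number * 1e6

t_chained = per_call_us("a[rs][:, cs]")
t_broadcast = per_call_us("a[rs[:, None], cs]")

print("chained:   %.1f us per call" % t_chained)
print("broadcast: %.1f us per call" % t_broadcast)
print("saved:     %.1f us per call" % (t_chained - t_broadcast))
```

If the saving per call is only a few tens of microseconds, it may be worth profiling the whole program before concluding that this slice is the bottleneck.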
