Numpy: Beginner nditer


Question

I am trying to learn nditer for possible use in speeding up my application. Here, I try to make a facetious reshape program that takes a size-20 array and reshapes it to a 5x4 array:

myArray = np.arange(20)
def fi_by_fo_100(array):
    offset = np.array([0, 4, 8, 12, 16])
    it = np.nditer([offset, None],
                      flags=['reduce_ok'],
                      op_flags=[['readonly'],
                                ['readwrite','allocate']],
                      op_axes=[None, [0,1,-1]],
                      itershape=(-1, 4, offset.size))

    while not it.finished:
        indices = np.arange(it[0],(it[0]+4), dtype=int)
        info = array.take(indices)
        '''Just for fun, we'll perform an operation on data.\
           Let's shift it to 100'''
        info = info + 81
        it.operands[1][...]=info
        it.iternext()
    return it.operands[1]

test = fi_by_fo_100(myArray)
>>> test
array([[ 97,  98,  99, 100]])

Clearly the program is overwriting each result into one row. So I tried using the indexing functionality of nditer, but still no dice.

flags=['reduce_ok','c_iter'] --> it.operands[1][it.index][...]=info =
IndexError: index out of bounds

flags=['reduce_ok','c_iter'] --> it.operands[1][it.iterindex][...]=info =
IndexError: index out of bounds

flags=['reduce_ok','multi_iter'] --> it.operands[1][it.multi_index][...]=info =
IndexError: index out of bounds

it[0][it.multi_index[1]][...]=info =
IndexError: 0-d arrays can't be indexed

...and so on. What am I missing? Thanks in advance.

I just happened across this nice article on nditer. I may be new to Numpy, but this is the first time I've seen Numpy speed benchmarks this far behind. It's my understanding that people choose Numpy for its numerical speed and prowess, but iteration is a part of that, no? What is the point of nditer if it's so slow?

Recommended answer

It really helps to break things down by printing out what's going on along the way.

First, let's replace your whole loop with this:

i = 0
while not it.finished:
    i += 1
    it.iternext()  # advance the iterator, otherwise the loop never ends
print(i)

It'll print 20, not 5. That's because you're doing a 5x4 iteration, not 5x1.

So, why is this even close to working? Well, let's look at the loop more carefully:

while not it.finished:
    print('>', it.operands[0], it[0])
    indices = np.arange(it[0], it[0] + 4, dtype=int)
    info = array.take(indices)
    info = info + 81
    it.operands[1][...] = info
    print('<', it.operands[1], it[1])
    it.iternext()  # advance the iterator, otherwise the loop never ends

You'll see that the first five iterations step through the offsets [0 4 8 12 16], generating [[81 82 83 84]], then [[85 86 87 88]], and so on. The next five iterations do the same thing, and so on until all 20 are done.

This is also why your c_index solutions didn't work: it.index is going to range from 0 to 19, and you don't have 20 of anything in it.operands[1].
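The index range can be checked directly. The following is a minimal sketch, assuming a plain 5x4 array (`grid`, a hypothetical stand-in for the iteration space, not the original op_axes/itershape setup) and NumPy's 'c_index' flag:

```python
import numpy as np

# Stand-in for the 5x4 iteration space (not the original op_axes setup).
grid = np.zeros((5, 4), dtype=int)
it = np.nditer(grid, flags=['c_index'])

seen = []
while not it.finished:
    seen.append(it.index)  # flat C-order position of the current element
    it.iternext()

print(len(seen), min(seen), max(seen))

# A 5-row array has no row 19, so using the flat index as a row index fails.
try:
    grid[max(seen)]
    raised = False
except IndexError:
    raised = True
print(raised)
```

The iterator visits 20 positions numbered 0 through 19, so any flat index of 5 or more is out of bounds when used to pick one of the 5 rows.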

If you did the multi_index right and ignored the columns, you could make this work. But you'd still be doing a 5x4 iteration, just to repeat each step 4 times, instead of doing the 5x1 iteration you want.
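As a rough illustration of that redundancy, here is a sketch that tracks 'multi_index' over a 5x4 shape and uses only the row part. The `template` array and the `steps` counter are assumptions introduced for this example, not part of the original code:

```python
import numpy as np

array = np.arange(20)
out = np.zeros((5, 4), dtype=array.dtype)

# Hypothetical template that only provides the 5x4 iteration shape.
template = np.empty((5, 4))
it = np.nditer(template, flags=['multi_index'])

steps = 0
while not it.finished:
    row, _col = it.multi_index  # the column is ignored on purpose
    out[row] = array[4 * row:4 * row + 4] + 81
    steps += 1
    it.iternext()

print(steps)
print(out)
```

Each row ends up correct, but it is written four times, once per ignored column.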

Your it.operands[1][...]=info overwrites the entire output with the current row each time through the loop. Generally, you shouldn't ever have to do anything to it.operands[1]. The whole point of nditer is that you just take care of each it[1], and the final it.operands[1] is the result.
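A minimal sketch of that pattern, assuming NumPy 1.15 or later for the context-manager form. The function name `shift_elementwise` and its `shift` parameter are made up for illustration:

```python
import numpy as np

def shift_elementwise(values, shift=81):
    # 'allocate' asks nditer to create the output operand itself.
    with np.nditer([values, None],
                   op_flags=[['readonly'], ['writeonly', 'allocate']]) as it:
        while not it.finished:
            it[1][...] = it[0] + shift  # write only the current element
            it.iternext()
        return it.operands[1]  # the filled output array

result = shift_elementwise(np.arange(20).reshape(5, 4))
print(result)
```

The loop body only ever touches it[1]. The allocated output is read back once, at the end, through it.operands[1].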

Of course a 5x4 iteration over rows makes no sense. Either do a 5x4 iteration over individual values, or a 5x1 iteration over rows.

If you want the former, the easiest way to do it is to reshape the input array, then just iterate over that:

it = np.nditer([array.reshape(5, -1), None],
               op_flags=[['readonly'],
                         ['readwrite','allocate']])
for a, b in it:
    b[...] = a + 81
return it.operands[1]

But of course that's silly. It's just a slower and more complicated way of writing:

return array+81

And it would be a bit silly to suggest that "the way to write your own reshape is to first call reshape, and then…"

So, you want to iterate over rows, right?

Let's simplify things a bit by getting rid of the allocate and explicitly creating a 5x4 array to start with:

outarray = np.zeros((5,4), dtype=array.dtype)
offset = np.array([0, 4, 8, 12, 16])
# 'c_index' is needed so that it.index is available below
it = np.nditer([offset, outarray],
               flags=['reduce_ok', 'c_index'],
               op_flags=[['readonly'],
                         ['readwrite']],
               op_axes=[None, [0]],
               itershape=[5])

while not it.finished:
    indices = np.arange(it[0], it[0] + 4, dtype=int)
    info = array.take(indices)
    # Just for fun, we'll perform an operation on data.
    # Let's shift it to 100
    info = info + 81
    it.operands[1][it.index][...] = info
    it.iternext()
return it.operands[1]

This is a bit of an abuse of nditer, but at least it does the right thing.

Since you're just doing a 1D iteration over the source and basically ignoring the second operand, there's really no good reason to use nditer here. If you need to do lockstep iteration over multiple arrays, for a, b in nditer([x, y], …) is cleaner than iterating over x and using the index to access y, just like for a, b in zip(x, y) outside of numpy. And if you need to iterate over multi-dimensional arrays, nditer is usually cleaner than the alternatives. But here, all you're really doing is iterating over [0, 4, 8, 12, 16], doing something with the result, and copying it into another array.
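For comparison, the same row-by-row copy can be sketched with an ordinary Python loop and no nditer at all. The name `fi_by_fo_100_loop` and the `width` and `shift` parameters are hypothetical, and the sketch assumes the array size is an exact multiple of `width`:

```python
import numpy as np

def fi_by_fo_100_loop(array, width=4, shift=81):
    # Assumes array.size is an exact multiple of width.
    offsets = range(0, array.size, width)  # 0, 4, 8, 12, 16 for size 20
    out = np.empty((len(offsets), width), dtype=array.dtype)
    for row, start in enumerate(offsets):
        out[row] = array[start:start + width] + shift
    return out

test = fi_by_fo_100_loop(np.arange(20))
print(test)
```

This does one Python-level step per row, and the slice arithmetic inside each step is still vectorized.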

Also, as I mentioned in the comments, if you find yourself using iteration in numpy, you're usually doing something wrong. All of the speed benefits of numpy come from letting it execute the tight loops in native C/Fortran or lower-level vector operations. Once you're looping over arrays, you're effectively just doing slow Python numerics with a slightly nicer syntax:

import numpy as np
import timeit

def add10_numpy(array):
    # Fully vectorized: the loop runs inside numpy
    return array + 10

def add10_nditer(array):
    it = np.nditer([array, None], [],
                   [['readonly'], ['writeonly', 'allocate']])
    for a, b in it:
        np.add(a, 10, b)
    return it.operands[1]

def add10_py(array):
    x, y = array.shape
    outarray = array.copy()
    for i in range(x):
        for j in range(y):
            outarray[i, j] = array[i, j] + 10
    return outarray

myArray = np.arange(100000).reshape(250,-1)

for f in add10_numpy, add10_nditer, add10_py:
    print('%12s: %s' % (f.__name__, timeit.timeit(lambda: f(myArray), number=1)))

On my system, this prints:

 add10_numpy: 0.000458002090454
add10_nditer: 0.292730093002
    add10_py: 0.127345085144

That shows you the cost of using nditer unnecessarily.
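For the original task, a fully vectorized version would presumably look like the sketch below. The function name `fi_by_fo_100_vectorized` is made up for illustration, and the sketch assumes the input has exactly 20 elements:

```python
import numpy as np

def fi_by_fo_100_vectorized(array, shift=81):
    # Assumes array has exactly 20 elements so that it fits a 5x4 shape.
    return (array + shift).reshape(5, 4)

test = fi_by_fo_100_vectorized(np.arange(20))
print(test)
```

Both the addition and the reshape run inside numpy here, with no Python-level loop.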
