Vectorizing operation on numpy array

Question

I have a numpy array containing many three-dimensional numpy arrays, where each of these sub-elements is a grayscale image. I want to use numpy's vectorize to apply an affine transformation to each image in the array.

Here is a minimal example that reproduces the issue:

import cv2
import numpy as np
from functools import partial

# create four blank images
data = np.zeros((4, 1, 96, 96), dtype=np.uint8)

M = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.float32) # dummy affine transformation matrix
size = (96, 96) # output image size

Now I want to pass each of the images in data to cv2.warpAffine(src, M, dsize). Before I vectorize it, I first create a partial function that binds M and dsize:

warpAffine = lambda M, size, img : cv2.warpAffine(img, M, size) # re-order function parameters
partialWarpAffine = partial(warpAffine, M, size)

vectorizedWarpAffine = np.vectorize(partialWarpAffine)
print(data[:, 0].shape) # prints (4, 96, 96)
vectorizedWarpAffine(data[:, 0])

But this outputs:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1573, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1633, in _vectorize_call
    ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
  File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1597, in _get_ufunc_and_otypes
    outputs = func(*inputs)
  File "<stdin>", line 1, in <lambda>
TypeError: src is not a numpy array, neither a scalar

What am I doing wrong - why can't I vectorize an operation on numpy arrays?

Recommended answer

The problem is that using partial does not make the other arguments go away as far as vectorize is concerned. The partial object is stored as vectorizedWarpAffine.pyfunc, and it keeps track of the pre-bound arguments to use when calling the underlying function, vectorizedWarpAffine.pyfunc.func, which is still a multi-argument function.

You can see it like this (after you import inspect):

In [19]: inspect.getargspec(vectorizedWarpAffine.pyfunc.func)
Out[19]: ArgSpec(args=['M', 'size', 'img'], varargs=None, keywords=None, defaults=None)
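
The same bookkeeping can be seen on a plain functools.partial object without NumPy or OpenCV. The sketch below is only an illustration: warp is a hypothetical stand-in that shares the parameter order of the lambda in the question, and the string "M" is a placeholder for the real matrix.

```python
import inspect
from functools import partial

def warp(M, size, img):
    # Hypothetical stand-in with the same parameter order as the question's
    # lambda; it just returns its arguments so they can be inspected.
    return (M, size, img)

# Bind the first two positional arguments, as the question does.
bound = partial(warp, "M", (96, 96))

# The partial object still refers to the original three-argument function...
underlying = bound.func
# ...and stores the pre-bound positional arguments separately.
prebound = bound.args
# Parameter names of the underlying function, read with inspect.signature.
param_names = list(inspect.signature(underlying).parameters)

print(underlying is warp)
print(prebound)
print(param_names)
```

inspect.signature is used here instead of inspect.getargspec because getargspec is not available in recent Python 3 releases.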

To get around this, you can use the excluded option to np.vectorize, which says which arguments (positional or keyword) to ignore when wrapping the vectorization behavior:

vectorizedWarpAffine = np.vectorize(partialWarpAffine, 
                                    excluded=set((0, 1)))

When I make this change, the code appears to actually execute the vectorized function now, but it hits an actual error in the imgwarp.cpp code, presumably due to some bad data assumption on this test data:

In [21]: vectorizedWarpAffine(data[:, 0])
OpenCV Error: Assertion failed (cn <= 4 && ssize.area() > 0) in remapBilinear, file -------src-dir-------/opencv-2.4.6.1/modules/imgproc/src/imgwarp.cpp, line 2296
---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-21-3fb586393b75> in <module>()
----> 1 vectorizedWarpAffine(data[:, 0])

/home/ely/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.pyc in __call__(self, *args, **kwargs)
   1570             vargs.extend([kwargs[_n] for _n in names])
   1571 
-> 1572         return self._vectorize_call(func=func, args=vargs)
   1573 
   1574     def _get_ufunc_and_otypes(self, func, args):

/home/ely/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.pyc in _vectorize_call(self, func, args)
   1628         """Vectorized call to `func` over positional `args`."""
   1629         if not args:
-> 1630             _res = func()
   1631         else:
   1632             ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)

/home/ely/anaconda/lib/python2.7/site-packages/numpy/lib/function_base.pyc in func(*vargs)
   1565                     the_args[_i] = vargs[_n]
   1566                 kwargs.update(zip(names, vargs[len(inds):]))
-> 1567                 return self.pyfunc(*the_args, **kwargs)
   1568 
   1569             vargs = [args[_i] for _i in inds]

/home/ely/programming/np_vect.py in <lambda>(M, size, img)
     10 size = (96, 96) # output image size
     11 
---> 12 warpAffine = lambda M, size, img : cv2.warpAffine(img, M, size) # re-order function parameters
     13 partialWarpAffine = partial(warpAffine, M, size)
     14 

error: -------src-dir-------/opencv-2.4.6.1/modules/imgproc/src/imgwarp.cpp:2296: error: (-215) cn <= 4 && ssize.area() > 0 in function remapBilinear
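
To see how excluded behaves in isolation from OpenCV, here is a minimal sketch with a made-up function, scale_and_offset, which is not part of the original question. It assumes only documented np.vectorize behavior: an excluded positional argument is handed to the function unmodified, while the remaining argument is iterated element by element.

```python
import numpy as np

def scale_and_offset(coeffs, x):
    # coeffs arrives as the whole (slope, intercept) tuple because position 0
    # is excluded; x arrives as one element of the input array per call.
    slope, intercept = coeffs
    return slope * x + intercept

# Exclude positional argument 0 from vectorization and state the output type.
vec_scale = np.vectorize(scale_and_offset, excluded=[0], otypes=[float])

xs = np.array([0.0, 1.0, 2.0])
result = vec_scale((2.0, 1.0), xs)
print(result)
```

Note that the indices in excluded count positions in the call to the vectorized function, so they depend on whether some arguments were already bound elsewhere.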

As a side note: I am seeing a shape of (4, 96, 96) for your data, not (4, 10, 10).

Also note that using np.vectorize is not a technique for improving the performance of a function. All it does is gently wrap your function call inside a superficial for-loop (albeit at the NumPy level). It is a technique for writing functions that automatically adhere to NumPy broadcasting rules and for making your API superficially similar to NumPy's API, whereby function calls are expected to work correctly on top of ndarray arguments.
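
A small sketch of that for-loop behavior, using a made-up add_one function rather than anything from the question. It records every call so the per-element iteration is visible; otypes is given so that no extra type-probing call is made.

```python
import numpy as np

calls = []

def add_one(x):
    # Record every argument the wrapped function receives.
    calls.append(x)
    return x + 1

values = np.arange(6).reshape(2, 3)

# Vectorized version: the Python function runs once per element.
vec_add_one = np.vectorize(add_one, otypes=[int])
vectorized = vec_add_one(values)

# Explicit loop computing the same result.
looped = np.empty(values.shape, dtype=int)
for idx in np.ndindex(values.shape):
    looped[idx] = values[idx] + 1

print(len(calls), values.size)
print(np.array_equal(vectorized, looped))
```

Each recorded argument is a single scalar, not a row or a sub-array, which is worth keeping in mind when the intent is to process one whole image per call.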

See this post for more details.

Added: The main reason you are using partial in this case is to get a new function that's ostensibly "single-argumented", but that doesn't work out as planned because of the way partial works. So why not just get rid of partial altogether?

You can leave your lambda function exactly as it is, even with the two non-array positional arguments, but still ensure that the third argument is treated as something to vectorize over. To do this, you just use excluded as above, but you also need to tell vectorize what to expect as the output.

The reason for this is that vectorize will try to determine what the output type is supposed to be by running your function on the first element of the data you supply. In this case (I am not fully sure, and it would be worth more debugging) this seems to trigger the "src is not a numpy array" error you were seeing.

So to prevent vectorize from even trying it, you can provide a list of the output types yourself, like this:

vectorizedWarpAffine = np.vectorize(warpAffine, 
                                    excluded=(0, 1), 
                                    otypes=[np.ndarray])

It works:

In [29]: vectorizedWarpAffine(M, size, data[:, 0])
Out[29]: 
array([[[ array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ..., 
       ...
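
The type-probing call mentioned above can be observed with a small experiment. This is a sketch with a hypothetical square function and a call recorder, not code from the question; it assumes the default cache=False setting of np.vectorize.

```python
import numpy as np

def make_recorder():
    # Build a function plus a list that records each argument it receives.
    seen = []
    def square(x):
        seen.append(x)
        return x * x
    return square, seen

values = np.array([1.0, 2.0, 3.0])

# Without otypes, vectorize first calls the function on the first element
# to work out the output type, then runs the real element-wise pass.
probed_fn, probed_seen = make_recorder()
np.vectorize(probed_fn)(values)

# With otypes given, the probing call is skipped.
typed_fn, typed_seen = make_recorder()
np.vectorize(typed_fn, otypes=[float])(values)

print(len(probed_seen), len(typed_seen))
```

The first count should come out larger than the second, since the probe is an additional invocation of the wrapped function.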

I think this is a lot nicer because now when you call vectorizedWarpAffine you still explicitly pass the other positional arguments, instead of going through the layer of misdirection where they are pre-bound with partial, and yet the third argument is still treated vectorially.
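
As an alternative that is not part of the original answer, newer NumPy versions (1.12 and later) accept a signature argument to np.vectorize, which lets the wrapped function receive a whole 2-D array per call instead of a single element. The sketch below assumes that feature is available and uses shift_image, a hypothetical stand-in for cv2.warpAffine, so that it runs without OpenCV; a plain list comprehension is shown alongside for comparison.

```python
import numpy as np
from functools import partial

def shift_image(shift, img):
    # Hypothetical stand-in for cv2.warpAffine(img, M, size): moves every
    # column of a 2-D image to the right by `shift`, wrapping around.
    return np.roll(img, shift, axis=1)

# Four small single-channel images, laid out like `data` in the question.
data = np.zeros((4, 1, 6, 6), dtype=np.uint8)
data[:, 0, :, 0] = 1  # mark the first column of every image

# signature tells vectorize that each call consumes and returns an (h, w)
# array, so the loop runs over the leading axis only.
per_image = np.vectorize(partial(shift_image, 2), signature="(h,w)->(h,w)")
out = per_image(data[:, 0])

# The same thing written as an explicit loop over images.
out_loop = np.stack([shift_image(2, img) for img in data[:, 0]])

print(out.shape)
print(np.array_equal(out, out_loop))
```

With a signature in place the function takes exactly one array argument per call, so binding the other arguments with partial is unproblematic here. Whether this or the explicit loop reads better is a matter of taste; neither is faster than the other in any meaningful way.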
