Cython:numpy数组的无符号int索引给出不同的结果 [英] Cython: unsigned int indices for numpy arrays gives different result

查看:64
本文介绍了Cython:numpy数组的无符号int索引给出不同的结果的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我通过添加一些类型并进行编译将python函数转换为cython. 我在python和cython函数的结果之间得到了很小的数值差异. 经过一些工作后,我发现区别在于使用unsigned int而不是int访问numpy数组.

I converted to cython a python function by just adding some types and compiling it. I was getting small numerical differences between the results of the python and cython functions. After some work I found that the differences came from accessing a numpy array using unsigned int instead of int.

我根据以下条件使用了无符号的int索引来加快访问速度: http://docs.cython.org/src/userguide/numpy_tutorial.html#tuning-indexing-further

I was using unsigned int indices to speed up access according to: http://docs.cython.org/src/userguide/numpy_tutorial.html#tuning-indexing-further

无论如何,我认为使用无符号整数是无害的.

anyway I thought it was harmless to use unsigned ints.

查看此代码:

cpdef function(np.ndarray[np.float32_t, ndim=2] response, max_loc):
    cdef unsigned int x, y   
    x, y = int(max_loc[0]), int(max_loc[1])
    x2, y2 = int(max_loc[0]), int(max_loc[1])
    print response[y,x], type(response[y,x]), response.dtype
    print response[y2,x2], type(response[y2,x2]), response.dtype   
    print 2*(response[y,x] - min(response[y,x-1], response[y,x+1]))
    print 2*(response[y2,x2] - min(response[y2,x2-1], response[y2,x2+1]))  

打印:

0.959878861904 <type 'float'> float32
0.959879 <type 'numpy.float32'> float32
1.04306024313
1.04306030273   

为什么会这样?!!!是虫子吗?

Why does this happen?!!! is it a bug?

好的,这里要求的是SSCCE,其类型和值与我在原始函数中使用的类型和值相同

Ok, as requested here is a SSCCE with the same types and values that I used in my original function

cpdef function():
    cdef unsigned int x, y  
    max_loc2 = np.asarray([ 15., 25.], dtype=float) 
    cdef np.ndarray[np.float32_t, ndim=2] response2 = np.zeros((49,49), dtype=np.float32)    
    x, y = int(max_loc2[0]), int(max_loc2[1])
    x2, y2 = int(max_loc2[0]), int(max_loc2[1])

    response2[y,x] = 0.959878861904  
    response2[y,x-1] = 0.438348740339
    response2[y,x+1] = 0.753262758255  


    print response2[y,x], type(response2[y,x]), response2.dtype
    print response2[y2,x2], type(response2[y2,x2]), response2.dtype
    print 2*(response2[y,x] - min(response2[y,x-1], response2[y,x+1]))
    print 2*(response2[y2,x2] - min(response2[y2,x2-1], response2[y2,x2+1]))  

打印

0.959878861904 <type 'float'> float32
0.959879 <type 'numpy.float32'> float32
1.04306024313
1.04306030273

我使用python 2.7.3 cython 0.18和msvc9 express

I use python 2.7.3 cython 0.18 and msvc9 express

推荐答案

我修改了问题中的示例,以使其更易于阅读为模块生成的C源代码.我只希望看到创建Python float对象的逻辑,而不是从response数组中获取np.float32对象.

I modified the example in the question to make it simpler to read the generated C source for the module. I'm only interested in seeing the logic that creates Python float objects instead of getting np.float32 objects from the response array.

我正在使用pyximport编译扩展模块.它将生成的C文件保存在~/.pyxbld(在Windows中可能是%userprofile%\.pyxbld)的子目录中.

I'm using pyximport to compile the extension module. It saves the generated C file in a subdirectory of ~/.pyxbld (probably %userprofile%\.pyxbld on Windows).

import numpy as np
import pyximport
pyximport.install(setup_args={'include_dirs': [np.get_include()]})

open('_tmp.pyx', 'w').write('''
cimport numpy as np
cpdef function(np.ndarray[np.float32_t, ndim=2] response, max_loc):
    cdef unsigned int p_one, q_one
    p_one = int(max_loc[0])
    q_one = int(max_loc[1])
    p_two = int(max_loc[0])
    q_two = int(max_loc[1])
    r_one = response[q_one, p_one]
    r_two = response[q_two, p_two]
''')

import _tmp
assert(hasattr(_tmp, 'function'))

这是感兴趣的部分生成的C代码(经过重新格式化以使其更易于阅读).事实证明,当您使用C unsigned int索引变量时,生成的代码直接从数组缓冲区中获取数据并调用PyFloat_FromDouble,将其强制为double.另一方面,当您使用Python int索引变量时,它将采用通用方法.它形成一个元组并调用PyObject_GetItem.这样,ndarray可以正确使用np.float32 dtype.

Here's the generated C code for the section of interest (a bit reformatted to make it easier to read). It turns out that when you use C unsigned int index variables, the generated code grabs the data directly from the array buffer and calls PyFloat_FromDouble, which coerces it to double. On the other hand, when you use Python int index variables, it takes the generic approach. It forms a tuple and calls PyObject_GetItem. This way allows the ndarray to correctly honor the np.float32 dtype.

#define __Pyx_BufPtrStrided2d(type, buf, i0, s0, i1, s1) \
    (type)((char*)buf + i0 * s0 + i1 * s1)

  /* "_tmp.pyx":9
 *     p_two = int(max_loc[0])
 *     q_two = int(max_loc[1])
 *     r_one = response[q_one, p_one]             # <<<<<<<<<<<<<<
 *     r_two = response[q_two, p_two]
 */
  __pyx_t_3 = __pyx_v_q_one;
  __pyx_t_4 = __pyx_v_p_one;
  __pyx_t_5 = -1;

  if (unlikely(__pyx_t_3 >= (size_t)__pyx_bshape_0_response))
    __pyx_t_5 = 0;
  if (unlikely(__pyx_t_4 >= (size_t)__pyx_bshape_1_response))
    __pyx_t_5 = 1;

  if (unlikely(__pyx_t_5 != -1)) {
    __Pyx_RaiseBufferIndexError(__pyx_t_5);
    {
      __pyx_filename = __pyx_f[0];
      __pyx_lineno = 9;
      __pyx_clineno = __LINE__;
      goto __pyx_L1_error;
    }
  }

  __pyx_t_1 = PyFloat_FromDouble((
    *__Pyx_BufPtrStrided2d(
      __pyx_t_5numpy_float32_t *,
      __pyx_bstruct_response.buf,
      __pyx_t_3, __pyx_bstride_0_response,
      __pyx_t_4, __pyx_bstride_1_response)));

  if (unlikely(!__pyx_t_1)) {
    __pyx_filename = __pyx_f[0];
    __pyx_lineno = 9;
    __pyx_clineno = __LINE__;
    goto __pyx_L1_error;
  }

  __Pyx_GOTREF(__pyx_t_1);
  __pyx_v_r_one = __pyx_t_1;
  __pyx_t_1 = 0;

  /* "_tmp.pyx":10
 *     q_two = int(max_loc[1])
 *     r_one = response[q_one, p_one]
 *     r_two = response[q_two, p_two]             # <<<<<<<<<<<<<<
 */
  __pyx_t_1 = PyTuple_New(2);

  if (unlikely(!__pyx_t_1)) {
    __pyx_filename = __pyx_f[0];
    __pyx_lineno = 10;
    __pyx_clineno = __LINE__;
    goto __pyx_L1_error;
  }

  __Pyx_GOTREF(((PyObject *)__pyx_t_1));
  __Pyx_INCREF(__pyx_v_q_two);
  PyTuple_SET_ITEM(__pyx_t_1, 0, __pyx_v_q_two);
  __Pyx_GIVEREF(__pyx_v_q_two);
  __Pyx_INCREF(__pyx_v_p_two);
  PyTuple_SET_ITEM(__pyx_t_1, 1, __pyx_v_p_two);
  __Pyx_GIVEREF(__pyx_v_p_two);

  __pyx_t_2 = PyObject_GetItem(
    ((PyObject *)__pyx_v_response),
    ((PyObject *)__pyx_t_1));

  if (!__pyx_t_2) {
    __pyx_filename = __pyx_f[0];
    __pyx_lineno = 10;
    __pyx_clineno = __LINE__;
    goto __pyx_L1_error;
  }

  __Pyx_GOTREF(__pyx_t_2);
  __Pyx_DECREF(((PyObject *)__pyx_t_1));
  __pyx_t_1 = 0;
  __pyx_v_r_two = __pyx_t_2;
  __pyx_t_2 = 0;

这篇关于Cython:numpy数组的无符号int索引给出不同的结果的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆