Difference between np.int, np.int_, int, and np.int_t in Cython?
Problem description
I am struggling a bit with the many int data types in Cython.
np.int, np.int_, np.int_t, int
I guess int in pure Python is equivalent to np.int_, so where does np.int come from? I cannot find any documentation for it in NumPy. Also, why does np.int_ exist, given that we already have int?
In Cython, I guess int becomes a C type when used as cdef int or ndarray[int], and when used as int() it stays the Python caster?
Is np.int_ equivalent to long in C? So is cdef long identical to cdef np.int_?
Under what circumstances should I use np.int_t instead of np.int? E.g. cdef np.int_t, ndarray[np.int_t] ...
Can someone briefly explain how the wrong use of those types would affect the performance of compiled cython code?
It's a bit complicated because the names have different meanings depending on the context.
int
In Python, int is normally just a Python type. It has arbitrary precision, meaning that you can store any conceivable integer in it (as long as you have enough memory).
>>> int(10**50)
100000000000000000000000000000000000000000000000000
However, when you use it as the dtype for a NumPy array, it will be interpreted as np.int_ [1]. That type does not have arbitrary precision; it has the same size as C's long:
>>> np.array(10**50, dtype=int)
OverflowError: Python int too large to convert to C long
That also means the following two are equivalent:
np.array([1,2,3], dtype=int)
np.array([1,2,3], dtype=np.int_)
As a Cython type identifier it has another meaning: here it stands for the C type int. It has limited precision (typically 32 bits). You can use it as a Cython type, for example when defining variables with cdef:
cdef int value = 100    # variable
cdef int[:] arr = ...   # memoryview
As the return type or argument type for cdef or cpdef functions:
cdef int my_function(int argument1, int argument2):
    # ...
As the "generic" type for ndarray:
cimport numpy as cnp
cdef cnp.ndarray[int, ndim=1] val = ...
For type casting:
avalue = <int>(another_value)
And probably many more.
In Cython, but as a Python type: you can still call int and you'll get a "Python int" (of arbitrary precision), or use it for isinstance or as the dtype argument for np.array. Here the context is important, so converting to a Python int is different from converting to a C int:
cdef object py_val = int(10)   # Python int
cdef int c_val = <int>(10)     # C int
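The two meanings of int can be compared without compiling anything. The sketch below uses the standard-library ctypes module as a stand-in for a C int; this is only an illustration, not how Cython handles the type internally. The 32-bit width is queried at run time rather than assumed, and ctypes (unlike a checked conversion) silently wraps out-of-range values.

```python
import ctypes

# Python int: arbitrary precision, limited only by available memory.
big = 10**50
print(big.bit_length())  # bits needed to store the value

# C int: fixed width (typically 32 bits); the width is queried, not assumed.
bits = ctypes.sizeof(ctypes.c_int) * 8
c_int_max = 2 ** (bits - 1) - 1

# ctypes performs no overflow checking, so max + 1 wraps to the minimum.
wrapped = ctypes.c_int(c_int_max + 1).value
print(bits, c_int_max, wrapped)
```

The Python value needs far more bits than a C int provides, which is why the context in which int appears matters.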
np.int
Actually this is very easy. It's just an alias for int:
>>> int is np.int
True
So everything from above applies to np.int as well. However, you can't use it as a type identifier except when you use it on the cimported package. In that case it represents the Python integer type.
cimport numpy as cnp
cpdef func(cnp.int obj):
    return obj
This will expect obj to be a Python integer, not a NumPy type:
>>> func(np.int_(10))
TypeError: Argument 'obj' has incorrect type (expected int, got numpy.int32)
>>> func(10)
10
My advice regarding np.int: avoid it whenever possible. In Python code it's equivalent to int, and in Cython code it's also equivalent to Python's int, but if used as a type identifier it will probably confuse you and everyone who reads the code! It certainly confused me... Newer NumPy releases agree: the alias was deprecated in NumPy 1.20 and removed in NumPy 1.24.
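The TypeError shown earlier in this section comes down to the fact that, on Python 3, a NumPy integer scalar is not an instance of the built-in int. A minimal sketch of that check, and of converting explicitly before calling a function typed for a Python int (the variable names are illustrative only):

```python
import numpy as np

np_scalar = np.int_(10)

# On Python 3 the NumPy scalar type does not subclass the built-in int.
is_python_int = isinstance(np_scalar, int)

# Convert explicitly when a plain Python int is required.
py_value = int(np_scalar)

print(type(np_scalar).__name__, is_python_int, type(py_value).__name__)
```

Passing py_value instead of np_scalar would satisfy an argument declared as a Python int.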
np.int_
Actually it only has one meaning: it's a Python type that represents a scalar NumPy type. You use it like Python's int:
>>> np.int_(10) # looks like a normal Python integer
10
>>> type(np.int_(10)) # but isn't (output may vary depending on your system!)
numpy.int32
Or you use it to specify the dtype, for example with np.array:
>>> np.array([1,2,3], dtype=np.int_)
array([1, 2, 3])
But you cannot use it as a type identifier in Cython.
cnp.int_t
It's the type-identifier version of np.int_. That means you can't use it as a dtype argument. But you can use it as the type for cdef declarations:
cimport numpy as cnp
import numpy as np
cdef cnp.int_t[:] arr = np.array([1,2,3], dtype=np.int_)
#    |---TYPE---|                         |---DTYPE---|
This example (hopefully) shows that the type identifier with the trailing _t actually represents the type of an array using the dtype without the trailing _t. You can't interchange them in Cython code!
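Since the buffer type and the array dtype have to match, one option is to coerce input to the matching dtype on the Python side before handing it to a function typed with cnp.int_t[:]. The following is a minimal sketch under that assumption; as_int_array is a hypothetical helper, not part of NumPy or Cython, and whether np.int_ and cnp.int_t really line up depends on the installed NumPy version.

```python
import numpy as np

def as_int_array(values):
    """Hypothetical helper: return values as an array with dtype np.int_."""
    return np.asarray(values, dtype=np.int_)

data = as_int_array([1, 2, 3])
print(data.dtype == np.dtype(np.int_))

# An array with a different dtype is converted (copied) to the matching one.
small = np.array([1, 2, 3], dtype=np.int8)
converted = as_int_array(small)
print(small.dtype, converted.dtype)
```

The conversion costs a copy when the dtype differs, so for large arrays it is usually better to create them with the right dtype in the first place.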
Notes
There are several more numeric types in NumPy. I'll include a list containing the NumPy dtype, the NumPy Cython type identifier, and the C type identifier that could also be used in Cython. It's basically taken from the NumPy documentation and the Cython NumPy pxd file:
NumPy dtype      NumPy Cython type    C Cython type identifier
np.bool_         None                 None
np.int_          cnp.int_t            long
np.intc          None                 int
np.intp          cnp.intp_t           ssize_t
np.int8          cnp.int8_t           signed char
np.int16         cnp.int16_t          signed short
np.int32         cnp.int32_t          signed int
np.int64         cnp.int64_t          signed long long
np.uint8         cnp.uint8_t          unsigned char
np.uint16        cnp.uint16_t         unsigned short
np.uint32        cnp.uint32_t         unsigned int
np.uint64        cnp.uint64_t         unsigned long long
np.float_        cnp.float64_t        double
np.float32       cnp.float32_t        float
np.float64       cnp.float64_t        double
np.complex_      cnp.complex128_t     double complex
np.complex64     cnp.complex64_t      float complex
np.complex128    cnp.complex128_t     double complex
Actually there are Cython types for np.bool_: cnp.npy_bool and bint, but neither of them can currently be used for NumPy arrays. For scalars, cnp.npy_bool will just be an unsigned integer while bint will be a boolean. Not sure what's going on there...
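The C types in the third column do not all have a fixed width; long in particular differs between platforms, which is why the output of type(np.int_(10)) can vary. A small sketch that reports the widths on the current machine, using the standard-library ctypes module as a proxy for the C compiler's types (an assumption that holds on common platforms):

```python
import ctypes

# C type name -> ctypes equivalent (widths are platform dependent).
c_types = {
    "signed char": ctypes.c_byte,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
    "ssize_t": ctypes.c_ssize_t,
    "float": ctypes.c_float,
    "double": ctypes.c_double,
}

# Width of each type in bits on this machine.
sizes = {name: ctypes.sizeof(ctype) * 8 for name, ctype in c_types.items()}

for name, bits in sizes.items():
    print(f"{name:12s} {bits:3d} bits")
```

When a specific width matters, the fixed-size entries such as np.int32 / cnp.int32_t avoid this platform dependence.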
[1] Taken from the NumPy documentation, "Data type objects":
Built-in Python types
Several python types are equivalent to a corresponding array scalar when used to generate a dtype object:
int             np.int_
bool            np.bool_
float           np.float_
complex         np.cfloat
bytes           np.bytes_
str             np.bytes_ (Python2) or np.unicode_ (Python3)
unicode         np.unicode_
buffer          np.void
(all others)    np.object_
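The mapping above can be checked directly with np.dtype. The sketch below only uses names that exist in both older and newer NumPy releases, so np.float64, np.complex128 and np.str_ stand in for np.float_, np.cfloat and np.unicode_ (those three aliases are not available in every NumPy version, so the substitution is an assumption about which aliases they pointed to):

```python
import numpy as np

# (built-in Python type, NumPy scalar type expected to give the same dtype)
pairs = [
    (int, np.int_),
    (bool, np.bool_),
    (float, np.float64),
    (complex, np.complex128),
    (bytes, np.bytes_),
    (str, np.str_),
    (object, np.object_),
]

results = {py.__name__: np.dtype(py) == np.dtype(scalar) for py, scalar in pairs}

for name, same in results.items():
    print(f"{name:8s} {same}")
```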