What does dtype=object mean while creating a numpy array?
Question
I was experimenting with numpy arrays and created a numpy array of strings:
ar1 = np.array(['avinash', 'jay'])
As I have read in their official guide, operations on numpy arrays are propagated to the individual elements. So I did this:
ar1 * 2
But then I got this error:
TypeError Traceback (most recent call last)
<ipython-input-22-aaac6331c572> in <module>()
----> 1 ar1 * 2
TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'int'
But when I use dtype=object while creating the array:
ar1 = np.array(['avinash', 'jay'], dtype=object)
I am able to do all operations.
Can anyone tell me why this is happening?
Answer
NumPy arrays are stored as contiguous blocks of memory. They usually have a single datatype (e.g. integers, floats or fixed-length strings), and the bits in memory are interpreted as values of that datatype.
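As a small illustration of the fixed-size layout described above, the sketch below inspects the element size and total buffer size of two arrays. The variable names `ints` and `strs` are only for illustration; the exact string dtype shown assumes Python 3, where NumPy stores each character in 4 bytes.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int64)
strs = np.array(['avinash', 'jay'])

# Every element occupies the same number of bytes in one contiguous buffer.
print(ints.dtype, ints.itemsize, ints.nbytes)

# The string array is fixed-width: every slot is as wide as the longest
# string (7 characters), so 'jay' is padded to the same width.
print(strs.dtype, strs.itemsize, strs.nbytes)
```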
Creating an array with dtype=object is different. The memory taken by the array is now filled with pointers to Python objects which are stored elsewhere in memory (much like a Python list is really just a list of pointers to objects, not the objects themselves).
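A minimal sketch of what "an array of pointers" means in practice: each slot of the object array has the size of one pointer, and indexing hands back the original Python object rather than a copy. The names `name` and `obj` are made up for this example.

```python
import numpy as np

name = 'avinash'
obj = np.array([name, 'jay'], dtype=object)

print(obj.dtype)       # object
print(obj.itemsize)    # size of one pointer (typically 8 on a 64-bit build)
print(type(obj[0]))    # an ordinary Python str, not a NumPy string scalar

# The array slot refers to the very same Python object we put in.
print(obj[0] is name)

# Because slots are just references, elements may have different types.
mixed = np.array(['text', 1, 2.5], dtype=object)
print([type(x).__name__ for x in mixed])
```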
Arithmetic operators such as * don't work with arrays such as ar1 which have a fixed-length string datatype (string_ or unicode_, named bytes_ and str_ in current NumPy releases); there are special functions instead, see below. NumPy is just treating the bits in memory as characters, and the * operator doesn't make sense here. However, the line
np.array(['avinash','jay'], dtype=object) * 2
works because now the array is an array of (pointers to) Python strings. The * operator is well defined for these Python string objects. New Python strings are created in memory, and a new object array with references to the new strings is returned.
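The behaviour described above can be checked with a short sketch: multiplying an object array applies Python's own * to each element, so the result matches a plain list comprehension. The `mixed` array is a hypothetical extra case added only to show that each element uses its own type's definition of *.

```python
import numpy as np

obj = np.array(['avinash', 'jay'], dtype=object)

doubled = obj * 2                # calls str.__mul__ on each element
manual = [s * 2 for s in obj]    # the equivalent pure-Python loop

print(doubled.tolist())          # ['avinashavinash', 'jayjay']
print(doubled.dtype)             # object
print(doubled.tolist() == manual)

# Each element is handled by its own type: str repeats, int multiplies.
mixed = np.array(['ab', 3], dtype=object) * 2
print(mixed.tolist())            # ['abab', 6]
```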
If you have an array with string_ or unicode_ dtype and want to repeat each string, you can use np.char.multiply:
In [52]: np.char.multiply(ar1, 2)
Out[52]: array(['avinashavinash', 'jayjay'],
dtype='<U14')
NumPy has many other vectorised string methods too.
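For instance, a few of the other functions in the np.char module can be applied to the original fixed-width string array as sketched below. The selection of functions here is only a sample, not a complete list.

```python
import numpy as np

ar1 = np.array(['avinash', 'jay'])

upper = np.char.upper(ar1)        # element-wise str.upper
suffixed = np.char.add(ar1, '!')  # element-wise concatenation
lengths = np.char.str_len(ar1)    # length of each string

print(upper.tolist())     # ['AVINASH', 'JAY']
print(suffixed.tolist())  # ['avinash!', 'jay!']
print(lengths.tolist())   # [7, 3]
```

Unlike the dtype=object approach, these functions keep the result in a fixed-width string array.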