What does dtype=object mean when creating a NumPy array?

Question

I was experimenting with NumPy arrays and created a NumPy array of strings:

import numpy as np
ar1 = np.array(['avinash', 'jay'])

As I read in the official guide, operations on a NumPy array are propagated to its individual elements. So I did this:

ar1 * 2

But then I got this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-22-aaac6331c572> in <module>()
----> 1 ar1 * 2

TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'int'

But when I use dtype=object

ar1 = np.array(['avinash', 'jay'], dtype=object)

while creating the array, I am able to do all operations.

Can anyone tell me why this is happening?

Recommended answer

NumPy arrays are stored as contiguous blocks of memory. They usually have a single datatype (e.g. integers, floats or fixed-length strings), and the bits in memory are interpreted as values of that datatype.
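
As a small illustration of that single-datatype layout, the sketch below inspects the string array from the question. The printed sizes assume NumPy's fixed-width Unicode storage of 4 bytes per character.

```python
import numpy as np

# A string array gets a fixed-width Unicode dtype sized to its longest element.
ar1 = np.array(['avinash', 'jay'])

print(ar1.dtype.kind)   # 'U' means fixed-width Unicode
print(ar1.itemsize)     # bytes reserved per element (7 characters * 4 bytes)
print(ar1.nbytes)       # total bytes in the contiguous buffer
```

Every element occupies the same number of bytes, so the shorter string 'jay' is padded out to the width of 'avinash'.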

Creating an array with dtype=object is different. The memory taken by the array is now filled with pointers to Python objects that are stored elsewhere in memory (much like a Python list is really just a list of pointers to objects, not the objects themselves).
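
The pointer behaviour can be checked directly. This is a minimal sketch; the names `name`, `obj_arr` and `mixed` are made up for illustration.

```python
import numpy as np

name = 'avinash'
obj_arr = np.array([name, 'jay'], dtype=object)

print(obj_arr.dtype)        # object
print(type(obj_arr[0]))     # an ordinary Python str, not a NumPy scalar
print(obj_arr[0] is name)   # the array slot refers to the very same object

# Because each slot is just a reference, elements need not share a type.
mixed = np.array(['text', 42, 3.5], dtype=object)
print([type(x).__name__ for x in mixed])
```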

Arithmetic operators such as * don't work with arrays such as ar1, which have a fixed-length string datatype (NumPy's str_ or bytes_ types, called unicode_ and string_ in older NumPy versions); there are special functions instead, see below. NumPy is just treating the bits in memory as characters, and the * operator doesn't make sense here. However, the line

np.array(['avinash','jay'], dtype=object) * 2

works because the array is now an array of (pointers to) Python strings. The * operator is well defined for these Python string objects. New Python strings are created in memory, and a new object array with references to the new strings is returned.
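
To make that concrete, the sketch below compares the object-array result with repeating each Python string by hand. The names `obj_arr`, `doubled` and `by_hand` are illustrative.

```python
import numpy as np

obj_arr = np.array(['avinash', 'jay'], dtype=object)
doubled = obj_arr * 2   # each element's own * operator is applied

# Same result as applying Python's str * int to every element by hand.
by_hand = [s * 2 for s in obj_arr]

print(doubled)
print(doubled.dtype)                 # still an object array
print(doubled.tolist() == by_hand)
```

The original array is left unchanged; the result is a separate object array holding the new strings.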

If you have an array with a fixed-length string dtype (string_ or unicode_ in older NumPy, now bytes_ and str_) and want to repeat each string, you can use np.char.multiply:

In [52]: np.char.multiply(ar1, 2)
Out[52]: array(['avinashavinash', 'jayjay'], 
      dtype='<U14')

NumPy has many other vectorised string methods too.
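
A few of those methods are sketched here on the same array. The functions come from the np.char module already used above; the variable names are illustrative.

```python
import numpy as np

ar1 = np.array(['avinash', 'jay'])

upper = np.char.upper(ar1)           # uppercase each element
greeting = np.char.add('hi ', ar1)   # elementwise concatenation
lengths = np.char.str_len(ar1)       # length of each string

print(upper)
print(greeting)
print(lengths)
```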
