Python: confusion between types and dtypes

Question

Suppose I enter:

import numpy as np
a = np.uint8(200)
a*2

Then the result is 400, and it is recast to be of type uint16.

But:

a = np.array([200], dtype=np.uint8)
a*2

then the result is

array([144], dtype=uint8)

The multiplication has been performed modulo 256, to ensure that the result stays in one byte.

I'm confused about "types" and "dtypes" and when one is used in preference to the other. As you can see, the type can make a significant difference to the output.

Can I, for example, create a single number of dtype uint8, so that operations on that number will be performed modulo 256? Alternatively, can I create an array of type (not dtype) uint8, so that operations on it will produce values outside the range 0-255?

Recommended answer

The type of a NumPy array is numpy.ndarray; this is just the type of Python object it is (similar to how type("hello") is str, for example).
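As a minimal sketch of the distinction (the variable names here are illustrative, not from the original answer), `type()` and `.dtype` can be inspected side by side for an array and for a NumPy scalar:

```python
import numpy as np

arr = np.array([200], dtype=np.uint8)
scalar = np.uint8(200)

# type() reports the Python class of the object.
arr_type = type(arr)          # numpy.ndarray
scalar_type = type(scalar)    # numpy.uint8, a NumPy scalar class

# .dtype reports how the underlying bytes are interpreted.
arr_dtype = arr.dtype         # dtype('uint8')
scalar_dtype = scalar.dtype   # dtype('uint8')

print(arr_type, arr_dtype)
print(scalar_type, scalar_dtype)
```

The array and the scalar have different Python types but the same dtype.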

A dtype defines how the bytes in memory are interpreted by a scalar (i.e. a single number) or an array, and how those bytes are treated (e.g. as int or float). For that reason you don't change the type of an array or scalar, just its dtype.
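One way to see that a dtype is only an interpretation of bytes is to reinterpret the same memory under a different dtype. The sketch below uses `ndarray.view`; the names `raw` and `as_signed` are made up for illustration:

```python
import numpy as np

raw = np.array([200, 1], dtype=np.uint8)   # two bytes: 0xC8, 0x01

# Same bytes, now read as signed 8-bit integers (no copy is made).
as_signed = raw.view(np.int8)
as_list = as_signed.tolist()               # [-56, 1], since 200 - 256 == -56

# Both arrays are still ndarrays backed by the same memory.
same_memory = np.shares_memory(raw, as_signed)
print(as_list, same_memory)
```

The Python type stays numpy.ndarray throughout; only the dtype, and therefore the meaning of each byte, changes.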

As you observe, if you multiply two scalars, the resulting datatype is the smallest "safe" type to which both values can be cast. However, multiplying an array and a scalar will simply return an array of the same datatype. The documentation for the function np.result_type is clear about when a particular scalar or array object's dtype is changed:

Type promotion in NumPy works similarly to the rules in languages like C++, with some slight differences. When both scalars and arrays are used, the array's type takes precedence and the actual value of the scalar is taken into account.

The documentation continues:

If there are only scalars or the maximum category of the scalars is higher than the maximum category of the arrays, the data types are combined with promote_types to produce the return value.

So for np.uint8(200) * 2, two scalars, the resulting datatype will be the type returned by np.promote_types:

>>> np.promote_types(np.uint8, int)
dtype('int32')
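The int32 shown above reflects a platform where NumPy's default integer is 32 bits wide; on many 64-bit systems the same call reports int64. The sketch below passes explicit dtypes to `np.promote_types` so the results do not depend on the platform default, or on how a given NumPy release treats plain Python ints (NumPy 2.0 changed that behaviour):

```python
import numpy as np

# Two unsigned types: the wider one wins.
u8_u16 = np.promote_types(np.uint8, np.uint16)   # uint16

# Unsigned and signed 8-bit: int16 is the smallest type covering both ranges.
u8_i8 = np.promote_types(np.uint8, np.int8)      # int16

# uint8 fits entirely inside int64.
u8_i64 = np.promote_types(np.uint8, np.int64)    # int64

print(u8_u16, u8_i8, u8_i64)
```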

For np.array([200], dtype=np.uint8) * 2, the array's datatype takes precedence over the scalar int and a np.uint8 datatype is returned.
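The array case can be checked directly. This is a small sketch using `np.result_type` to ask which dtype the operation will produce before performing it:

```python
import numpy as np

arr = np.array([200], dtype=np.uint8)

# The Python int 2 fits in uint8, so it does not widen the array's dtype.
predicted = np.result_type(arr, 2)

result = arr * 2              # 400 wraps modulo 256
print(predicted)              # uint8
print(result, result.dtype)   # [144] uint8
```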

To address your final question about retaining the dtype of a scalar during operations, you'll have to restrict the datatypes of any other scalars you use to avoid NumPy's automatic dtype promotion:

>>> np.uint8(200) * np.uint8(2)
144
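A slightly fuller sketch of the same idea follows. When NumPy integer scalars overflow, NumPy may emit a RuntimeWarning; wrapping the operation in `np.errstate(over="ignore")` is one way to silence it when the wraparound is intentional:

```python
import numpy as np

a = np.uint8(200)

# Both operands are uint8 scalars, so the product stays uint8 and wraps.
with np.errstate(over="ignore"):
    product = a * np.uint8(2)

print(int(product), product.dtype)   # 144 uint8
```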

The alternative, of course, is to simply wrap the single value in a NumPy array (and then NumPy won't cast it in operations with scalars of a different dtype).

To promote the type of an array during an operation, you could wrap any scalars in an array first:

>>> np.array([200], dtype=np.uint8) * np.array([2])
array([400])
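Another option, not part of the original answer, is to widen the array explicitly with `astype` before operating on it. A minimal sketch, assuming uint16 is wide enough for the values involved:

```python
import numpy as np

arr = np.array([200], dtype=np.uint8)

# astype returns a copy with the requested dtype; arr itself is unchanged.
wide = arr.astype(np.uint16)
result = wide * 2

print(result, result.dtype)   # [400] uint16
print(arr.dtype)              # uint8
```

This makes the target dtype explicit rather than leaving it to the promotion rules.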
