Why are 0d arrays in Numpy not considered scalar?

Problem description

Surely a 0d array is a scalar, but Numpy does not seem to think so... Am I missing something, or am I just misunderstanding the concept?

>>> foo = numpy.array(1.11111111111, numpy.float64)
>>> numpy.ndim(foo)
0
>>> numpy.isscalar(foo)
False
>>> foo.item()
1.11111111111

Solution

One should not think too hard about it. It's ultimately better for the mental health and longevity of the individual.

The curious situation with Numpy scalar types was born out of the fact that there is no graceful and consistent way to degrade a 1x1 matrix to a scalar type. Even though mathematically they are the same thing, they are handled by very different code.
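
To make the split concrete (a small illustrative session of my own, not from the original answer): a 0d array and a Numpy scalar holding the same value are instances of two entirely different types, handled by different code paths:

>>> import numpy
>>> type(numpy.array(1.0))       # the 0d array goes through the ndarray machinery
<class 'numpy.ndarray'>
>>> type(numpy.float64(1.0))     # the scalar is a separate type altogether
<class 'numpy.float64'>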

If you've been writing any amount of scientific code, ultimately you'd want things like max(a) to work on matrices of all sizes, even scalars. Mathematically, this is a perfectly sensible thing to expect. However, for programmers this means that whatever represents a scalar in Numpy should have .shape and .ndim attributes, so that at least the ufuncs don't have to do explicit type checking on their inputs for the 21 possible scalar types in Numpy.
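
A minimal sketch of what that buys you (variable names are mine): a Numpy scalar answers the same shape questions an array does, so reductions such as numpy.max accept it without special-casing:

>>> import numpy
>>> s = numpy.float64(3.0)
>>> s.shape, s.ndim              # a scalar, yet it reports array-like attributes
((), 0)
>>> print(numpy.max(s))          # so max() works on it just like on any array
3.0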

On the other hand, they should also work with existing Python libraries that do explicit type checks on scalar types. This is a dilemma, since a Numpy ndarray would have to individually change its type once it had been reduced to a scalar, and there is no way of knowing whether that has occurred without checking on every access. Actually going that route would probably make things ridiculously slow to work with by scalar-type standards.
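
As a rough illustration (my own example, not part of the original answer), this is why a reduction hands back a separate scalar type rather than mutating the array: the scalar satisfies ordinary Python type checks that an ndarray never could:

>>> import numpy
>>> a = numpy.array([1.0, 2.0])
>>> m = a.max()                  # the reduction returns a Numpy scalar, not a 0d array
>>> isinstance(m, float)         # which passes a plain-Python type check...
True
>>> isinstance(a, float)         # ...that the ndarray itself could never satisfy
False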

The Numpy developers' solution is to have Numpy's own scalar types inherit from both ndarray and the Python scalars, so that all scalars also have .shape, .ndim, .T, etc. The 1x1 matrix will still be there, but its use is discouraged if you know you'll be dealing with a scalar. While this should work fine in theory, occasionally you can still see some places where they missed with the paint roller, and the ugly innards are exposed for all to see:

>>> from numpy import *
>>> a = array(1)
>>> b = int_(1)
>>> a.ndim
0
>>> b.ndim
0
>>> a[...]
array(1)
>>> a[()]
1
>>> b[...]
array(1)
>>> b[()]
1

There's really no reason why a[...] and a[()] should return different things, but they do. There are proposals in place to change this, but it looks like they forgot to finish the job for 1x1 arrays.

A potentially bigger, and possibly unresolvable, issue is the fact that Numpy scalars are immutable. Therefore "spraying" a scalar into an ndarray, mathematically the adjoint operation of collapsing an array into a scalar, is a PITA to implement. You can't actually grow a Numpy scalar; it cannot, by definition, be cast into an ndarray in place, even though newaxis mysteriously works on it:

>>> b[0,1,2,3] = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'numpy.int32' object does not support item assignment
>>> b[newaxis]
array([1])

In Matlab, growing the size of a scalar is a perfectly acceptable and brainless operation. In Numpy you have to stick a jarring a = array(a) everywhere you think there's a possibility of starting with a scalar and ending up with an array. I understand why Numpy has to be this way to play nice with Python, but that doesn't change the fact that many new switchers are deeply confused about this. Some have explicit memories of struggling with this behaviour and eventually persevering, while others who are too far gone are generally left with a deep, shapeless mental scar that frequently haunts their most innocent dreams. It's an ugly situation for all.
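
As a minimal sketch of that dance (the helper pad_with_zero is hypothetical, not from Numpy or the original answer), the defensive a = array(a) is what lets one function accept either a bare scalar or an ndarray:

>>> import numpy
>>> def pad_with_zero(a):
...     a = numpy.array(a)                   # defensively promote a possible scalar to an ndarray
...     return numpy.concatenate([a.ravel(), [0]])
...
>>> pad_with_zero(5)                         # works when handed a bare scalar
array([5, 0])
>>> pad_with_zero(numpy.array([1, 2]))       # and when handed a real array
array([1, 2, 0])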
