Why are 0d arrays in Numpy not considered scalar?


Problem Description

Surely a 0d array is scalar, but Numpy does not seem to think so... am I missing something or am I just misunderstanding the concept?

>>> foo = numpy.array(1.11111111111, numpy.float64)
>>> numpy.ndim(foo)
0
>>> numpy.isscalar(foo)
False
>>> foo.item()
1.11111111111

Solution

One should not think too hard about it. It's ultimately better for the mental health and longevity of the individual.

The curious situation with Numpy scalar types was born out of the fact that there is no graceful and consistent way to degrade a 1x1 matrix to a scalar type. Even though mathematically they are the same thing, they are handled by very different code.
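
For a concrete look (my own illustration, not part of the original answer), the 0d array and the scalar print alike but are entirely different classes:

>>> import numpy as np
>>> type(np.array(1.0))      # the 0d array is an ndarray
<class 'numpy.ndarray'>
>>> type(np.float64(1.0))    # the scalar is a separate class entirely
<class 'numpy.float64'>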

If you've been doing any amount of scientific code, ultimately you'd want things like max(a) to work on matrices of all sizes, even scalars. Mathematically, this is a perfectly sensible thing to expect. For programmers, however, it means that whatever represents a scalar in Numpy should have the .shape and .ndim attributes, so that at the very least the ufuncs don't have to do explicit type checking on their inputs for the 21 possible scalar types in Numpy.
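
As a small sketch of mine (assuming a reasonably recent Numpy), both the scalar and the 0d array carry those attributes, so a reduction like max works uniformly on any input:

>>> import numpy as np
>>> s = np.float64(1.5)       # a Numpy scalar
>>> print(s.shape, s.ndim)    # array-style attributes, no type check needed
() 0
>>> print(np.max(np.array([[1, 2], [3, 4]])))   # reduction on a 2d array...
4
>>> print(np.max(s))                            # ...and on a bare scalar
1.5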

On the other hand, they should also work with existing Python libraries that do explicit type checks on scalar types. This is a dilemma, since an ndarray would have to change its type once it has been reduced to a scalar, and there is no way of knowing whether that has occurred without doing checks on every access. Actually going that route would probably make things ridiculously slow to work with by scalar-type standards.
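
To illustrate the dilemma (my example, not from the original answer): np.float64 passes Python's own type checks because it subclasses the built-in float, while the 0d array does not:

>>> import numpy as np
>>> print(isinstance(np.float64(3.14), float))   # the scalar type subclasses float
True
>>> print(isinstance(np.array(3.14), float))     # the 0d ndarray fails the check
False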

The Numpy developers' solution was to have their own scalar types inherit from both ndarray and the Python scalars, so that all scalars also have .shape, .ndim, .T, and so on. The 1x1 matrix will still be there, but its use is discouraged if you know you'll be dealing with a scalar. While this should work fine in theory, occasionally you can still see some places they missed with the paint roller, where the ugly innards are exposed for all to see:

>>> from numpy import *
>>> a = array(1)
>>> b = int_(1)
>>> a.ndim
0
>>> b.ndim
0
>>> a[...]
array(1)
>>> a[()]
1
>>> b[...]
array(1)
>>> b[()]
1

There's really no reason why a[...] and a[()] should return different things, but they do. There are proposals in place to change this, but it looks like they forgot to finish the job for 1x1 arrays.
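
The practical takeaway (my framing, not the original author's) is that empty-tuple indexing is a reliable way to collapse a 0d array down to a true scalar, while ... keeps it an array:

>>> import numpy as np
>>> a = np.array(1)
>>> print(type(a[...]))     # Ellipsis indexing returns a 0d array
<class 'numpy.ndarray'>
>>> print(type(a[()]))      # unwraps to a scalar (numpy.int32 on some platforms)
<class 'numpy.int64'>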

A potentially bigger, and possibly non-resolvable, issue is the fact that Numpy scalars are immutable. Therefore "spraying" a scalar into an ndarray, mathematically the adjoint operation of collapsing an array into a scalar, is a PITA to implement. You can't actually grow a Numpy scalar, and by definition it cannot be cast into an ndarray, even though newaxis mysteriously works on it:

>>> b[0,1,2,3] = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'numpy.int32' object does not support item assignment
>>> b[newaxis]
array([1])
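
If you do need to "spray" a scalar across an array, one workaround sketch (mine, not part of the answer) is to build a fresh array around the value instead of trying to grow the scalar in place:

>>> import numpy as np
>>> b = np.int_(1)
>>> print(np.full((2, 3), b))           # explicit fill with the scalar's value
[[1 1 1]
 [1 1 1]]
>>> print(np.broadcast_to(b, (2, 3)))   # read-only broadcast view, no copy
[[1 1 1]
 [1 1 1]]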

In Matlab, growing the size of a scalar is a perfectly acceptable and brainless operation. In Numpy you have to stick a jarring a = array(a) everywhere you think there's a possibility of starting with a scalar and ending up with an array. I understand why Numpy has to be this way to play nice with Python, but that doesn't change the fact that many new switchers are deeply confused by this. Some have explicit memories of struggling with this behaviour and eventually persevering, while others who are too far gone are generally left with some deep, shapeless mental scar that frequently haunts their most innocent dreams. It's an ugly situation for all.
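
A minimal sketch of that defensive idiom (the function name scale is my own, hypothetical): normalise scalar-or-array input with asarray up front, so the rest of the code can treat everything as an array:

>>> import numpy as np
>>> def scale(x, factor):
...     x = np.asarray(x)    # scalar in -> 0d array out; arrays pass through
...     return x * factor
...
>>> print(scale(2.0, 3))             # works for a plain Python scalar
6.0
>>> print(scale(np.arange(4), 3))    # and for a full array
[0 3 6 9]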
