Understanding non-homogeneous numpy arrays


Question

I have recently started using numpy and noticed a peculiar thing.

import numpy as np
a = np.array([[1,2,3], [4,5,9, 8]])
print a.shape, "shape"
print a[1, 0]

The shape, in this case, comes out to be (2L,). However, if I make a homogeneous numpy array as a = np.array([[1,2,3], [4,5,6]]), then a.shape gives (2L, 3L). I understand that the shape of a non-homogeneous array is difficult to represent as a tuple.

Additionally, print a[1,0] for the non-homogeneous array that I created earlier gives a traceback: IndexError: too many indices for array. Doing the same on the homogeneous array gives back the correct element, 4.

Noticing these two peculiarities, I am curious to know how Python looks at non-homogeneous numpy arrays at a low level. Thank you in advance.

Recommended answer

When the sublists differ in length, np.array falls back to creating an object dtype array:

In [272]: a = np.array([[1,2,3], [4,5,9, 8]])
In [273]: a
Out[273]: array([[1, 2, 3], [4, 5, 9, 8]], dtype=object)

This array is similar to the list we started with. Both store the sublists as pointers. The sublists exist elsewhere in memory.
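As a minimal sketch of what "an array of pointers to lists" means in practice, the example below builds the same ragged array and inspects it. It passes dtype=object explicitly, which is an addition to the original code: newer NumPy releases reject ragged input unless the object dtype is requested. The names ragged, first_of_second and two_index_error are illustrative only.

```python
import numpy as np

# dtype=object is given explicitly (an assumption added here);
# newer NumPy versions raise an error for ragged input without it.
ragged = np.array([[1, 2, 3], [4, 5, 9, 8]], dtype=object)

print(ragged.shape)        # one dimension of length 2
print(type(ragged[1]))     # each element is an ordinary Python list

# Index the array first, then the list it holds.
first_of_second = ragged[1][0]
print(first_of_second)

# A two-index lookup fails because the array itself is only 1-d.
try:
    ragged[1, 0]
    two_index_error = None
except IndexError as exc:
    two_index_error = exc
print(two_index_error)
```

So a[1][0] works where a[1, 0] does not: the first index selects a list out of the array, and the second index is ordinary list indexing.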

With equal-length sublists, it can create a 2d array with integer elements:

In [274]: a2 = np.array([[1,2,3], [4,5,9]])
In [275]: a2
Out[275]: 
array([[1, 2, 3],
       [4, 5, 9]])

In fact, to confirm my claim that the sublists are stored elsewhere in memory, let's try to change one:

In [276]: alist = [[1,2,3], [4,5,9, 8]]
In [277]: a = np.array(alist)
In [278]: a
Out[278]: array([[1, 2, 3], [4, 5, 9, 8]], dtype=object)
In [279]: a[0].append(4)
In [280]: a
Out[280]: array([[1, 2, 3, 4], [4, 5, 9, 8]], dtype=object)
In [281]: alist
Out[281]: [[1, 2, 3, 4], [4, 5, 9, 8]]

That would not work in the case of a2. a2 has its own data storage, independent of the source list.
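The contrast can be checked directly. The following is a small sketch, not part of the original answer: it uses dtype=object explicitly so it also runs on newer NumPy versions, and blist is a hypothetical name for the equal-length source list.

```python
import numpy as np

# Object array: the elements are the very same list objects as in alist.
alist = [[1, 2, 3], [4, 5, 9, 8]]
a = np.array(alist, dtype=object)
shares_sublist = a[0] is alist[0]
print(shares_sublist)

# Numeric 2-d array: the values are copied into the array's own buffer.
blist = [[1, 2, 3], [4, 5, 9]]
a2 = np.array(blist)
blist[0][0] = 100          # change the source list after the array exists
print(a2[0, 0])            # the array still holds the original value
```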

The basic point is that np.array tries to create an n-d array where possible. If it can't, it falls back to creating an object dtype array. And, as has been discussed in other questions, it sometimes raises an error. It is also tricky to intentionally create an object array.
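To illustrate why creating an object array on purpose is tricky, here is a hedged sketch. With equal-length sublists, asking for dtype=object still yields a 2-d array (of Python int objects), not a 1-d array of lists. One common workaround, shown here as an option rather than the only approach, is to allocate an empty object array and fill it element by element. The names rows, as_2d and as_1d are illustrative.

```python
import numpy as np

rows = [[1, 2, 3], [4, 5, 6]]

# Requesting dtype=object does not stop np.array from going as deep as it can.
as_2d = np.array(rows, dtype=object)
print(as_2d.shape)

# Pre-allocate a 1-d object array, then store each list as a single element.
as_1d = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    as_1d[i] = row
print(as_1d.shape)
print(as_1d[0])
```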

The shape of a is simply (2,), a single-element tuple: a is a 1d array. But that shape conveys nothing about the elements of a. The same goes for alist: len(alist) is 2. An object array can have a more complex shape, e.g. a.reshape(1,2,1), but it still just contains pointers.
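A short sketch of that point, assuming the same ragged array built with an explicit dtype=object (an addition for compatibility with newer NumPy): reshaping changes only how the pointers are arranged, and the lengths of the sublists have to be read from the elements themselves.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 9, 8]], dtype=object)

# Reshaping rearranges the two pointers; the lists are untouched.
reshaped = a.reshape(1, 2, 1)
print(reshaped.shape)
print(reshaped[0, 1, 0])

# The array shape says nothing about the sublists, so ask each element.
lengths = [len(item) for item in a]
print(lengths)
```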

a contains two 4-byte pointers; a2 contains six 4-byte integers. (These sizes are from the 32-bit build used here; a 64-bit build stores 8-byte pointers.)

In [282]: a.itemsize
Out[282]: 4
In [283]: a.nbytes
Out[283]: 8
In [284]: a2.nbytes
Out[284]: 24
In [285]: a2.itemsize
Out[285]: 4
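Since the numbers above depend on the platform, the relationship can be checked without hard-coding sizes. This sketch assumes the object array is built with an explicit dtype=object and uses struct.calcsize("P") from the standard library to get the native pointer size; pointer_size is an illustrative name.

```python
import struct
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 9, 8]], dtype=object)
a2 = np.array([[1, 2, 3], [4, 5, 9]])

# Size of a native pointer on this interpreter (4 on 32-bit, 8 on 64-bit).
pointer_size = struct.calcsize("P")

# The object array stores one pointer per element.
print(a.itemsize, a.nbytes)

# The integer array stores the six values themselves;
# its itemsize depends on the platform's default integer type.
print(a2.itemsize, a2.nbytes)
```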
