NumPy Structured Arrays by Name AND Index

Published: 2016/6/3 10:08:11. Tags: python, arrays, numpy

Question

I can never seem to get NumPy arrays to work nicely for me. :(

My dataset is simple: 150 rows of 4 floats followed by one string. I tried the following:

data = np.genfromtxt("iris.data2", delimiter=",", names=["SL", "SW", "PL", "PW", "class"], dtype=[float, float, float, float, '|S16'])

print(data.shape)  # ---> (150, 0)
print(data["PL"])
print(data[:, 0:3])  # <--- error

So I changed it to just 5 floats by doing a simple file replace. I only did this because I couldn't get the non-homogeneous array to work nicely with both column-name and index access. But now that I have made it homogeneous, it still gives me back a shape of (150, 0) and an error.

data = np.genfromtxt("iris.data", delimiter=",", names=["SL", "SW", "PL", "PW", "class"])

print(data.shape)  # ---> (150, 0)
print(data["PL"])
print(data[:, 0:3])  # <--- error

When I remove the names entirely, it works for index-column access, but obviously not for names anymore.

data = np.genfromtxt("iris.data", delimiter=",")

print(data.shape)  # ---> (150, 5)
# print(data["PL"])
print(data[:, 0:3])  # ---> WORKS GREAT!!!

Why is this and how do I fix it? Ideally I would like both name and index column access without replacing the string with a float-code, but I will do it if I need to in order to get name and index column access.

Recommended answer

There's a clear distinction between the fields of a 1d structured array and the columns of a 2d array. They aren't interchangeable. Field names aren't simply column labels. If that isn't clear, you may need to read the dtype or structured array docs in more detail.
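To make the distinction concrete, here is a minimal sketch using a hand-built two-row array. The field names and values are made up for illustration and are not the asker's file. It shows that a structured array has only one axis, so column-style slicing has nothing to slice.

```python
import numpy as np

# A 1d structured array: each element is one record with named fields.
# Field names and values here are illustrative only.
rec = np.array(
    [(5.1, 3.5, b"setosa"), (4.9, 3.0, b"setosa")],
    dtype=[("SL", float), ("SW", float), ("cls", "S16")],
)

print(rec.shape)   # one axis only: the records
print(rec["SL"])   # access a field by name
print(rec[0])      # access a record by position

# There is no second axis, so 2d-style slicing raises IndexError.
failed = False
try:
    rec[:, 0:2]
except IndexError as exc:
    failed = True
    print("IndexError:", exc)
```

This is why the named genfromtxt calls in the question fail on data[:, 0:3] while the unnamed call, which returns a plain 2d float array, works.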

Define a dummy file:

In [93]: txt=b"""1,2,3,4,txt
   ....: 5,6,7,8,abc"""

In [94]: np.genfromtxt(txt.splitlines(),delimiter=',',dtype=None)
Out[94]: 
array([(1, 2, 3, 4, 'txt'), (5, 6, 7, 8, 'abc')], 
      dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4'), ('f3', '<i4'), ('f4', 'S3')])

With mixed columns the default way to load it is a structured array, with 2 rows (shape=(2,)), and 5 fields, indexed as data['f0'] or data[['f0','f2']]. The ability to index several fields at once is limited.
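As a rough illustration of that limit: selecting several fields still gives back a 1d structured array, not a 2d block. The sketch below also shows one possible way to get a 2d block from the numeric fields, numpy.lib.recfunctions.structured_to_unstructured. That helper is not part of the original answer and assumes NumPy 1.16 or later. The input is passed as text with an explicit encoding, which is also an assumption made to keep the example version-independent.

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Same dummy data as above, passed as text with an explicit encoding.
txt = "1,2,3,4,txt\n5,6,7,8,abc"
data = np.genfromtxt(io.StringIO(txt), delimiter=",", dtype=None,
                     encoding="utf-8")

# Multi-field indexing: the result is still 1d and still structured.
sub = data[["f0", "f2"]]
print(sub.dtype.names)
print(sub.shape)

# Assumption: NumPy >= 1.16. Convert the numeric fields to a plain 2d array,
# which then supports ordinary column slicing.
numeric = rfn.structured_to_unstructured(data[["f0", "f1", "f2", "f3"]])
print(numeric.shape)
print(numeric[:, 0:3])
```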

But we can define a compound dtype, such as:

In [102]: dt=np.dtype([('data',float,(4,)),('lbl','|S5')])

In [103]: dt
Out[103]: dtype([('data', '<f8', (4,)), ('lbl', 'S5')])

In [104]: np.genfromtxt(txt.splitlines(),delimiter=',',dtype=dt)
Out[104]: 
array([([1.0, 2.0, 3.0, 4.0], 'txt'), ([5.0, 6.0, 7.0, 8.0], 'abc')], 
      dtype=[('data', '<f8', (4,)), ('lbl', 'S5')])

In [105]: data=np.genfromtxt(txt.splitlines(),delimiter=',',dtype=dt)

In [106]: data['data']
Out[106]: 
array([[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.]])

In [107]: data['lbl']
Out[107]: 
array(['txt', 'abc'], 
      dtype='|S5')

In [108]: data[0]
Out[108]: ([1.0, 2.0, 3.0, 4.0], 'txt')

Now data['data'] is a 2d array, containing the numeric values from the original text.
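Applied to data shaped like the question's (4 floats plus a class string), the same idea might look like the sketch below. The three sample rows, the "cls" field name, the "U16" string type and the cols lookup dict are all assumptions for illustration. The point is that the numeric block is reached by name and then sliced by position.

```python
import io
import numpy as np

# Three made-up rows in the question's layout: 4 floats, then a label.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
)

# Compound dtype: one 4-wide float field plus one string field.
dt = np.dtype([("data", float, (4,)), ("cls", "U16")])
iris = np.genfromtxt(sample, delimiter=",", dtype=dt, encoding="utf-8")

features = iris["data"]      # plain 2d float array
print(features.shape)
print(features[:, 0:3])      # positional column slicing works here
print(iris["cls"])           # labels by field name

# Hypothetical helper: map the question's column names to positions.
cols = {"SL": 0, "SW": 1, "PL": 2, "PW": 3}
print(features[:, cols["PL"]])
```

A plain dict is used for the name lookup because the 2d block itself carries no column names; only the outer fields ("data", "cls") are named.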

The field names can be fetched as a tuple:

In [112]: data.dtype.names
Out[112]: ('data', 'lbl')

So it is possible to perform the usual list/tuple indexing on them, and even do something as convoluted as viewing the fields in reverse order:

In [115]: data[list(data.dtype.names[::-1])]
Out[115]: 
array([('txt', [1.0, 2.0, 3.0, 4.0]), ('abc', [5.0, 6.0, 7.0, 8.0])], 
      dtype=[('lbl', 'S5'), ('data', '<f8', (4,))])
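The same trick gives a kind of positional access to fields when every column is its own field. A small sketch, using a hand-built array with the question's field names and invented values:

```python
import numpy as np

# Illustrative records using the question's field names; values are made up.
flowers = np.array(
    [(5.1, 3.5, 1.4, 0.2), (7.0, 3.2, 4.7, 1.4)],
    dtype=[("SL", float), ("SW", float), ("PL", float), ("PW", float)],
)

names = flowers.dtype.names          # tuple of field names, in order

third = flowers[names[2]]            # the field at position 2, i.e. "PL"
print(third)

first_three = flowers[list(names[0:3])]   # fields 0..2, still structured
print(first_three.dtype.names)
```

Note that first_three is still a 1d structured array, so this selects fields by position but does not turn them into 2d columns.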
