Numpy: understanding the numpy array concept for row names


Question

This may be a very vague question, but digging through the numpy links did not help me.

I need to calculate a similarity matrix, followed by hierarchical clustering, for a binary array that looks like this:

name    val1    val2    val3    val4    val5
comp1   0   0   1   0   1
comp2   1   0   0   0   0
comp3   0   0   1   0   0
comp4   1   1   0   0   0
comp5   0   0   1   0   0

I don't understand the concept of row names in numpy. I can read the file like this:

import numpy as np

test = np.genfromtxt('test.b', delimiter='\t', names=True, dtype=None)
print(type(test[0]))
# numpy.void
print(test[0])
# ('comp1', 0, 0, 1, 0, 1)

But how do I take the row names into account (this information is very important)? Is that possible?

I suppose that numpy.void is not the right way to store a binary array for a later similarity matrix calculation?

Recommended answer

Numpy doesn't really support row names. It does support column names, through structured arrays. You could use something like dtype=[('name', object), ('val1', int), ...]. Building that dtype could probably be automated by reading the first line of the file.
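A minimal sketch of the structured-array approach, assuming the table from the question is tab-separated. The inline string raw stands in for the file, and the fixed-width text type "U16" for the name column is an assumption made here for illustration (the answer itself suggests object). The dtype is built from the header line, and the row names are then kept in a separate array next to a plain 2-D integer array:

```python
import io
import numpy as np

# Inline copy of the table from the question, standing in for 'test.b'.
raw = (
    "name\tval1\tval2\tval3\tval4\tval5\n"
    "comp1\t0\t0\t1\t0\t1\n"
    "comp2\t1\t0\t0\t0\t0\n"
    "comp3\t0\t0\t1\t0\t0\n"
    "comp4\t1\t1\t0\t0\t0\n"
    "comp5\t0\t0\t1\t0\t0\n"
)

# Build the dtype from the header: first column is text, the rest are integers.
# "U16" (up to 16 characters) is an assumed width, not something from the answer.
header = raw.splitlines()[0].split("\t")
dtype = [(header[0], "U16")] + [(col, np.int64) for col in header[1:]]

data = np.genfromtxt(io.StringIO(raw), delimiter="\t", skip_header=1,
                     dtype=dtype, encoding="utf-8")

# Keep the labels and the numbers side by side: row i of `values`
# belongs to row_names[i].
row_names = data["name"]
values = np.column_stack([data[col] for col in header[1:]])
```

Numpy itself still knows nothing about the labels here; the link between a name and its row is only the shared position in row_names and values.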

What genfromtxt is giving you here is a one-dimensional structured array: each element is a record (numpy.void) whose first field holds a string and whose remaining fields hold integers. The row names are therefore ordinary data in the first field rather than row labels, and the values do not form a plain 2-D numeric array that a similarity calculation could use directly.

You may be interested in pandas, which builds on numpy arrays and adds support for labeled rows (among many other things). pandas.read_table (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_table.html?highlight=read_table#pandas.read_table) will handle your file nicely.
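A short sketch of the pandas route, again assuming tab-separated input and using an inline string raw in place of the file. The "shared ones" count at the end is only an illustrative similarity measure chosen for this example, not necessarily the one the question needs:

```python
import io
import pandas as pd

# Inline copy of the table from the question, standing in for 'test.b'.
raw = (
    "name\tval1\tval2\tval3\tval4\tval5\n"
    "comp1\t0\t0\t1\t0\t1\n"
    "comp2\t1\t0\t0\t0\t0\n"
    "comp3\t0\t0\t1\t0\t0\n"
    "comp4\t1\t1\t0\t0\t0\n"
    "comp5\t0\t0\t1\t0\t0\n"
)

# index_col=0 turns the first column into row labels instead of data.
df = pd.read_table(io.StringIO(raw), index_col=0)

# Rows can now be looked up by name.
comp3 = df.loc["comp3"]

# The underlying numbers are available as a plain 2-D integer array.
values = df.to_numpy()

# Illustrative similarity (an assumption for this sketch): the number of
# columns in which both rows contain a 1, labeled by row name on both axes.
shared = pd.DataFrame(values @ values.T, index=df.index, columns=df.index)
```

Because shared carries the row names on both axes, an entry such as shared.loc["comp1", "comp3"] can be read off by name rather than by position.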
