genfromtxt - Force column name generation for unknown number of columns
Question
I have trouble getting numpy to load tabular data and automatically generate column names. It seems pretty simple but I cannot nail it.
If I knew the number of columns, I could easily create the names parameter, but I don't have this knowledge, and I would like to avoid prior introspection of the data file.
How can I force numpy to generate the column names, or use a tuple-like dtype automatically, when I don't know how many columns there are in the file? I want to manipulate the column names after reading the data.
My approaches so far:
data = np.genfromtxt(tar_member, unpack = True, names = '')
- I wanted to force automatic generation of column names by passing an "empty" parameter. This fails with ValueError: size of tuple must match number of fields.
data = np.genfromtxt(tar_member, unpack = True, names = True)
- "Works", but consumes the first row of data as the header.
data = np.genfromtxt(tar_member, unpack = True, dtype = None)
- Worked for data with mixed types. Automatic type guessing expanded dtype into a tuple and assigned the names. However, for data where everything was actually float, dtype was set to float64, and I got ValueError: there are no fields defined when I tried accessing data.dtype.names.
Recommended answer
I know there is a cleaner way to do this, but if you don't mind forcing the issue you can generate your dtype structure and assign it directly to the data array.
import numpy

# Write a purely numeric 10x10 table to disk as a test file.
x = numpy.random.rand(10, 10)
numpy.savetxt('test.out', x, delimiter=',')
dataa = numpy.genfromtxt('test.out', delimiter=',', dtype=None)
if dataa.dtype.names is None:  # no fields, so dataa is a homogeneous 2-D array
    # Build one (name, type) pair per column: f0, f1, ...
    l1 = [('f%d' % z, dataa.dtype) for z in range(dataa.shape[1])]
    dataa.dtype = numpy.dtype(l1)
dataa.dtype
dataa.dtype.names
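The same idea can be wrapped in a small helper. This is only a sketch: load_with_generated_names is a hypothetical name, not part of numpy, and it uses ndarray.view instead of assigning to the dtype attribute, which is a substitution for the in-place assignment shown above. It assumes the input is either mixed-type (genfromtxt already names the fields) or homogeneous.

```python
import io
import numpy as np

def load_with_generated_names(source, **kwargs):
    """Hypothetical helper: read with genfromtxt and guarantee named fields."""
    data = np.genfromtxt(source, dtype=None, **kwargs)
    if data.dtype.names is not None:
        return data  # mixed types: genfromtxt already assigned field names
    # Homogeneous data comes back as a plain array. Force it to 2-D so a
    # file with a single row is still treated as one row of columns.
    table = np.ascontiguousarray(np.atleast_2d(data))
    fields = [('f%d' % i, table.dtype) for i in range(table.shape[1])]
    # Reinterpret each row as one record, then drop the length-1 axis.
    return table.view(np.dtype(fields)).reshape(-1)

# In-memory stand-in for the data file (illustrative data only).
text = "1.0,2.0,3.0\n4.0,5.0,6.0\n"
data = load_with_generated_names(io.StringIO(text), delimiter=',')
print(data.dtype.names)  # ('f0', 'f1', 'f2')

# The generated names can be replaced after reading.
data.dtype.names = ('x', 'y', 'z')
print(data['y'])         # [2. 5.]
```

One limitation of this sketch: genfromtxt returns a 1-D array both for a single-row file and for a single-column file, so the helper cannot tell them apart and treats both as one row.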