NumPy genfromtxt: using filling_values correctly

Problem description

I am trying to process data saved to CSV that may have missing values in an unknown number of columns (up to around 30). I want to set those missing values to 0 using the filling_values argument of genfromtxt. Here is a minimal working example for numpy 1.6.2 running in ActiveState ActivePython 2.7 32-bit on Windows 7.

import numpy

text = "a,b,c,d\n1,2,3,4\n5,,7,8"
# Write the sample CSV first; genfromtxt needs the file to exist before it is read.
b = open('test.txt','w')
b.write(text)
b.close()
a = numpy.genfromtxt('test.txt',delimiter=',',names=True)
print "plain",a

a = numpy.genfromtxt('test.txt',delimiter=',',names=True,filling_values=0)
print "filling_values=0",a

a = numpy.genfromtxt('test.txt',delimiter=',',names=True,filling_values={1:0})
print "filling_values={1:0}",a

a = numpy.genfromtxt('test.txt',delimiter=',',names=True,filling_values={0:0})
print "filling_values={0:0}",a

a = numpy.genfromtxt('test.txt',delimiter=',',names=True,filling_values={None:0})
print "filling_values={None:0}",a

Result:

plain [(1.0, 2.0, 3.0, 4.0) (5.0, nan, 7.0, 8.0)]
filling_values=0 [(1.0, 2.0, 3.0, 4.0) (5.0, nan, 7.0, 8.0)]
filling_values={1:0} [(1.0, 2.0, 3.0, 4.0) (5.0, 0.0, 7.0, 8.0)]
filling_values={0:0} [(1.0, 2.0, 3.0, 4.0) (5.0, nan, 7.0, 8.0)]

Traceback (most recent call last):
  File "C:\Users\tolivo.EE\Documents\active\eng\python\sizer\testGenfromtxt.py", line 20, in <module>
    a = numpy.genfromtxt('test.txt',delimiter=',',names=True,filling_values={None:0})
  File "C:\Users\tolivo.EE\AppData\Roaming\Python\Python27\site-packages\numpy\lib\npyio.py", line 1451, in genfromtxt
    filling_values[key] = val
TypeError: list indices must be integers, not NoneType

From the NumPy user guide I would expect filling_values=0 and filling_values={None:0} to work, but they don't: the former leaves the missing value as nan and the latter raises a TypeError. Specifying the correct column (filling_values={1:0}) works, but since the file has a large and unknown number of columns before the user makes a selection, I am looking for a way to set the fill value for all columns automatically, as the user guide hints.

I imagine I can count the columns in advance and build a dict to pass to filling_values as a stopgap, but is there a better way?
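The stopgap mentioned above, counting the columns and building a dict, could look roughly like the sketch below. The helper name build_filling_values is hypothetical, and it assumes a simple header line with no quoted delimiters; it only relies on the fact, shown in the output above, that an index-keyed dict such as {1:0} is honoured.

```python
def build_filling_values(header_line, fill=0, delimiter=","):
    """Map every column index found in the header line to the same fill value."""
    # Assumption: fields never contain the delimiter inside quotes.
    column_count = len(header_line.strip().split(delimiter))
    return dict((index, fill) for index in range(column_count))


# Header of the sample file from the question.
filling = build_filling_values("a,b,c,d")
print(filling)
```

The resulting dict could then be passed as filling_values=filling to numpy.genfromtxt, in the same way as the {1:0} call in the question.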

Recommended answer

It's not obvious from the documentation, but filling_values="0" (the string, not the integer 0) works.

In [19]: !cat test.txt
a,b,c,d
1,2,3,4
5,,7,8
9,10,,12

In [20]: a = numpy.genfromtxt('test.txt', delimiter=',', names=True, filling_values="0")

In [21]: print a
[(1.0, 2.0, 3.0, 4.0) (5.0, 0.0, 7.0, 8.0) (9.0, 10.0, 0.0, 12.0)]
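For readers on Python 3, the same call can be tried without a file on disk. This is a minimal sketch that assumes a recent NumPy rather than the 1.6.2 setup from the question, and it uses io.StringIO as a stand-in for test.txt; behaviour on other versions may differ.

```python
import io

import numpy as np

# Same sample data as test.txt above, held in memory instead of on disk.
csv_text = "a,b,c,d\n1,2,3,4\n5,,7,8\n9,10,,12\n"

# filling_values="0" is the string form suggested in the answer; a single
# scalar is applied to every column, so no per-column dict is needed.
data = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    names=True,
    filling_values="0",
)

print(data["b"])  # column b, including the filled second data row
print(data["c"])  # column c, including the filled third data row
```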
