Python: numpy.genfromtxt - Need column names that contain invalid characters
Problem description
I am working on importing CSV files with numpy.genfromtxt.
The data to be imported has a header row of column names, and some of those names contain characters that genfromtxt considers invalid. Specifically, some of the names contain "#" and " " (a space). The input data cannot be changed, because it is generated by other sources that I do not control.
With names=True and comments=None, I am unable to bring in all of the column names that I need.
I tried overriding numpy.lib.NameValidator.deletechars=None, but this does not affect the NameValidator instance that is actually in use.
I understand that deletechars exists because a recarray can expose a field as if it were an attribute. However, I simply must be able to read in column names that include invalid characters, even if those characters are stripped off when read in.
Is there a way to force the NameValidator not to check for invalid characters, or to modify the characters it checks for? I cannot modify numpy/lib/_iotools.py: I am not root, and modifying a shared installation would be a bad idea.
Recommended answer
You do not explicitly state that numpy.genfromtxt is a hard requirement, so let me suggest that you try asciitable.
This module has a way to replace certain entries before parsing: http://cxc.harvard.edu/contrib/asciitable/#replace-bad-or-missing-values
You can also define your own readers based on the existing ones: http://cxc.harvard.edu/contrib/asciitable/#advanced-table-reading
The output of an asciitable reader is a numpy array, so you should be able to replace the functions you currently use more or less directly with asciitable.
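If staying with numpy.genfromtxt is preferable, one possible workaround (not part of the original answer) is to parse the header line yourself and hand only the data rows to genfromtxt, so that no name validation is applied to the header at all. The sketch below assumes a comma-delimited file with purely numeric data; the helper name read_csv_keep_names and the sample text are hypothetical.

```python
import csv
import io

import numpy as np


def read_csv_keep_names(text, delimiter=","):
    """Return (raw_names, columns) for CSV text whose header has odd characters.

    Hypothetical helper: the header is parsed with the csv module, so
    genfromtxt never sees it and never strips characters from the names.
    """
    lines = text.splitlines()
    raw_names = next(csv.reader([lines[0]], delimiter=delimiter))
    body = "\n".join(lines[1:])
    # comments=None keeps '#' in the data from being treated as a comment marker.
    data = np.genfromtxt(io.StringIO(body), delimiter=delimiter, comments=None)
    # Force a 2-D shape so column slicing also works for a single row or column.
    data = data.reshape(-1, len(raw_names))
    # Map each unmodified header name to its column of values.
    columns = {name: data[:, i] for i, name in enumerate(raw_names)}
    return raw_names, columns


# Made-up sample with '#' and a space in the header names.
sample = "id #,max value,plain\n1,2.5,3\n4,5.5,6\n"
names, cols = read_csv_keep_names(sample)
```

Here the columns are looked up by their original names, for example cols["max value"], instead of by recarray attribute access, which is the trade-off for keeping the characters that genfromtxt would otherwise remove.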