How to preserve column names while importing data using numpy?

Views: 213  Published: 2020/5/18 19:11:29  Tags: python numpy


Question

I am using the numpy library in Python to import CSV file data into an ndarray as follows:

import numpy as np

data = np.genfromtxt('mydata.csv',
                     delimiter=',', dtype=None, names=True)

The result has the following column names:

print(data.dtype.names)

('row_label',
 'MyDataColumn1_0',
 'MyDataColumn1_1')

The original column names are:

row_label, My-Data-Column-1.0, My-Data-Column-1.1

It appears that NumPy is forcing my column names into C-style variable name formatting. Yet in many cases my Python scripts need to access columns by column name, so I need the column names to remain constant. To accomplish this, either NumPy needs to preserve the original column names or I need to convert my column names to the format NumPy is using.

Is there a way to preserve the original column names during import?

If not, is there an easy way to convert column labels to the format NumPy is using, preferably with some NumPy function?

Recommended answer

If you set names=True, the first line of your data file is passed through this validator, which genfromtxt builds internally and then calls on the header names:

validate_names = NameValidator(excludelist=excludelist,
                               deletechars=deletechars,
                               case_sensitive=case_sensitive,
                               replace_space=replace_space)

These are the options you can supply:

excludelist : sequence, optional
    A list of names to exclude. This list is appended to the default list
    ['return','file','print']. Excluded names are appended an underscore:
    for example, `file` would become `file_`.
deletechars : str, optional
    A string combining invalid characters that must be deleted from the
    names.
defaultfmt : str, optional
    A format used to define default field names, such as "f%i" or "f_%02i".
autostrip : bool, optional
    Whether to automatically strip white spaces from the variables.
replace_space : char, optional
    Character(s) used in replacement of white spaces in the variables
    names. By default, use a '_'.
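For the second part of the question (converting labels to the format NumPy uses), one option is to call the same validator directly. This is a minimal sketch, assuming NameValidator can be imported from the private module numpy.lib._iotools; that module is not part of the public API, so the import path may change between NumPy versions. The label list is the one from the question.

```python
# NameValidator lives in a private module; this import path is an assumption
# based on the NumPy source linked in this answer and may change.
from numpy.lib._iotools import NameValidator

original_labels = ["row_label", "My-Data-Column-1.0", "My-Data-Column-1.1"]

# With no arguments the validator uses its default excludelist and deletechars.
validate_names = NameValidator()
converted_labels = validate_names(original_labels)

# Map each original label to the field name the validator produces for it.
label_map = dict(zip(original_labels, converted_labels))
for original, converted in label_map.items():
    print(f"{original!r} -> {converted!r}")
```

The mapping only matches what genfromtxt produces if the validator is built with the same options (deletechars, excludelist, case_sensitive, replace_space) that are passed to genfromtxt.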

Perhaps you could try supplying an empty string as your own deletechars. But you'd be better off modifying the default set and passing that:

defaultdeletechars = set("""~!@#$%^&*()-=+~\|]}[{';: /?.>,<""")

Just take the period and the minus sign out of that set, and pass it as:

np.genfromtxt(..., names=True, deletechars="""~!@#$%^&*()=+~\|]}[{';: /?>,<""")
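As a self-contained illustration of that call, the sketch below reads a small in-memory CSV instead of a file. The sample data and the variable names (DEFAULT_DELETECHARS, csv_text) are made up for this example; the character string is the default set quoted above, written as a raw string so the backslash is taken literally.

```python
import io

import numpy as np

# Default characters NumPy deletes from field names (copied from the set above).
DEFAULT_DELETECHARS = r"""~!@#$%^&*()-=+~\|]}[{';: /?.>,<"""

# Keep '-' and '.' so that names such as 'My-Data-Column-1.0' survive the import.
keep = "-."
deletechars = "".join(c for c in DEFAULT_DELETECHARS if c not in keep)

# Hypothetical sample data standing in for mydata.csv.
csv_text = (
    "row_label,My-Data-Column-1.0,My-Data-Column-1.1\n"
    "1,0.5,1.5\n"
    "2,2.5,3.5\n"
)

data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", dtype=None,
                     names=True, deletechars=deletechars)

print(data.dtype.names)
# Columns remain accessible under their original names.
print(data["My-Data-Column-1.0"])
```

Field names containing '-' or '.' cannot be used with attribute-style access (for example on a recarray), so index with the string name as shown.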

Here is the source: https://github.com/numpy/numpy/blob/master/numpy/lib/_iotools.py#L245
