Pandas read_csv() conditionally skipping header row

Views: 332 · Published: 2020/10/12 20:39:36 · Tags: python, pandas, csv


Problem description

I'm trying to read CSV files, but they don't all share the same format. I'd like to add checks so that I don't have to edit my code or the input files for each one.

My problem is that some of these CSV files have a line of text above the column headers. For example:

Created on 12-11-2018,CryptoDataDownload.com
Date,Symbol,Open,High,Low,Close,Volume From,Volume To
2018-12-11 11-AM,ADABTC,8.6e-06,8.61e-06,8.55e-06,8.57e-06,301141.7,2.59
2018-12-11 10-AM,ADABTC,8.69e-06,8.72e-06,8.6e-06,8.6e-06,236949.63,2.05

If I import this file, the parser takes the first line as the header and splits the file into two columns, Created on 12-11-2018 and CryptoDataDownload.com.

This is what df.head() looks like:

                        Created on 12-11-2018 CryptoDataDownload.com
Date             Symbol Open     High     Low      Close              Volume From                          Volume To
2018-12-11 11-AM ADABTC 8.6e-06  8.61e-06 8.55e-06 8.57e-06              301141.7                               2.59
2018-12-11 10-AM ADABTC 8.69e-06 8.72e-06 8.6e-06  8.6e-06              236949.63                               2.05
2018-12-11 09-AM ADABTC 8.7e-06  8.7e-06  8.62e-06 8.69e-06             509311.39                               4.41
2018-12-11 08-AM ADABTC 8.69e-06 8.7e-06  8.63e-06 8.7e-06              111367.34                             0.9656

I want to check whether a file has this line and skip it if so.

How can I do this?

Recommended answer

If the extra line in your CSV files follows a consistent pattern, you can do something simple: sniff the first line, then decide whether to skip the first row.

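import pandas as pd

filename = '/path/to/file.csv'
# bool -> int: skip one row only when the first line is the "Created on ..." banner
skiprows = int('Created on' in next(open(filename)))
df = pd.read_csv(filename, skiprows=skiprows)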

Good practice is to use a context manager, which closes the file for you, so you could also do this:

import pandas as pd

filename = '/path/to/file.csv'
skiprows = 0
with open(filename, 'r') as f:
    # Only the first line matters: skip it if it is the "Created ..." banner.
    if f.readline().startswith('Created '):
        skiprows = 1
df = pd.read_csv(filename, skiprows=skiprows)
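
If many files need the same check, the logic above could be wrapped in a small helper. The following is only a sketch: the function names banner_rows and read_csv_auto and the default prefix 'Created on' are assumptions made for illustration, not part of pandas or of the original answer.

```python
def banner_rows(path, prefix="Created on"):
    """Return 1 if the first line of `path` starts with `prefix`, else 0."""
    with open(path, "r", encoding="utf-8") as f:
        first_line = f.readline()  # an empty file yields '' and therefore 0
    return int(first_line.startswith(prefix))


def read_csv_auto(path, prefix="Created on", **kwargs):
    """Read a CSV with pandas, skipping a leading banner line when present."""
    import pandas as pd  # imported lazily so banner_rows works without pandas

    # skiprows is a documented read_csv parameter; an int skips that many lines
    return pd.read_csv(path, skiprows=banner_rows(path, prefix), **kwargs)
```

Calling read_csv_auto('/path/to/file.csv') would then read files with or without the banner line in the same way, and any other read_csv keyword arguments can be passed through unchanged.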
