Trailing delimiter confuses pandas read_csv

Problem description

A CSV (comma-delimited) file in which every line has an extra trailing delimiter seems to confuse pandas.read_csv. (The data file is [1].)

It treats the extra delimiter as if there were an extra column, so each data row has one more field than the header provides names for. pandas.read_csv then takes the first column as row labels. The overall effect is that columns and headers are no longer aligned: the first column becomes the row labels, the second column is named by the first header, and so on.
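A minimal sketch that reproduces the misalignment described above. It uses a small made-up inline sample rather than the FEC file; the column names `a`, `b`, `c` and the values are illustrative assumptions.

```python
import io

import pandas as pd

# Hypothetical sample: three headers, but every data row ends with an extra comma.
csv_text = "a,b,c\n1,2,3,\n4,5,6,\n"

df = pd.read_csv(io.StringIO(csv_text))

# Each data row has one more field than the header, so pandas uses the
# first field of every row as the row label and shifts the rest left.
print(df.index.tolist())     # row labels taken from the first data column
print(df["a"].tolist())      # values that were meant to sit under "b"
print(df["c"].isna().all())  # the empty trailing field lands under "c"
```

Printing `df` itself shows the same shift at a glance: the header `a` sits over the second field of each row.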

It is quite annoying. Any idea how to tell pandas.read_csv to do the right thing? I couldn't find a way.

Nice book, by the way.

[1]: 2012 FEC Election Database, from chapter 9 of the book Python for Data Analysis

Recommended answer

I created a GitHub issue to look into handling this case automatically:

https://github.com/pydata/pandas/issues/2442

I think the FEC file format changed slightly, which causes this annoying issue. If you use the copy of the file posted at http://github.com/pydata/pydata-book, you hopefully won't have that problem.
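The answer above does not name a workaround for files that already have the trailing delimiter. One option, which comes from the pandas documentation for `read_csv` rather than from this answer, is to pass `index_col=False`, which tells pandas not to use the first column as the index. The sketch below assumes a small made-up sample (the names `a`, `b`, `c` are illustrative), not the real FEC file.

```python
import io

import pandas as pd

# Hypothetical sample with a trailing comma on every data row.
csv_text = "a,b,c\n1,2,3,\n4,5,6,\n"

# index_col=False stops pandas from promoting the first column to row labels,
# so the headers line up with the data and the empty trailing field is dropped.
fixed = pd.read_csv(io.StringIO(csv_text), index_col=False)

print(fixed)
```

For a file on disk, the same keyword would be passed alongside the path, for example `pd.read_csv(path, index_col=False)`, where `path` stands for wherever the file was saved.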
