Pandas dataframe read_csv on bad data


Problem description


I want to read in a very large csv (it cannot be opened in Excel and edited easily), but somewhere around the 100,000th row there is a row with one extra column, causing the program to crash. This row is erroneous, so I need a way to ignore the fact that it has an extra column. There are around 50 columns, so hardcoding the headers and using names or usecols isn't preferable. I'll also possibly encounter this issue in other csv's and want a generic solution. I couldn't find anything in read_csv, unfortunately. The code is as simple as this:

import pandas as pd

def loadCSV(filePath):
    dataframe = pd.read_csv(filePath, index_col=False, encoding='iso-8859-1', nrows=1000)
    datakeys = dataframe.keys()
    return dataframe, datakeys

Answer

Pass error_bad_lines=False to skip erroneous rows:


error_bad_lines : boolean, default True. Lines with too many fields (e.g. a csv line with too many commas) will by default cause an exception to be raised, and no DataFrame will be returned. If False, then these "bad lines" will be dropped from the DataFrame that is returned. (Only valid with C parser)
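Note that error_bad_lines was deprecated in pandas 1.3 and removed in pandas 2.0; the replacement is the on_bad_lines parameter, where on_bad_lines='skip' gives the same behavior. A minimal sketch with a small in-memory CSV (the data here is illustrative, not from the question):

```python
import io
import pandas as pd

# Sample data: the second data row has an extra (fourth) field,
# mimicking the malformed row described in the question.
csv_data = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"

# pandas >= 1.3: on_bad_lines='skip' drops malformed rows.
# On older versions, use error_bad_lines=False instead.
df = pd.read_csv(io.StringIO(csv_data), on_bad_lines='skip')

print(df)  # the row "4,5,6,7" is dropped; 2 rows x 3 columns remain
```

on_bad_lines also accepts 'warn' (skip but emit a warning) and 'error' (the default, raise), which is useful when you want to log which rows were discarded rather than drop them silently.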
