pandas read csv with extra commas in column


Problem description

I'm reading a basic csv file where the columns are separated by commas, with these column names:

userid,username,body

However, the body column is a string which may contain commas. Obviously this causes a problem, and pandas throws an error:

CParserError: C error: Expected 3 fields in line 3, saw 8

Is there a way to tell pandas to ignore commas in a specific column, or a way to work around this problem?

Recommended answer

Imagine we're reading your data from a file called comma.csv:

userid, username, body
01, n1, 'string1, string2'

One thing you can do is specify the quote character that delimits the strings in the column:

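import pandas as pd
df = pd.read_csv('comma.csv', quotechar="'", skipinitialspace=True)

In this case, strings delimited by ' are treated as a single field, regardless of any commas inside them. Because the sample has a space after each comma, skipinitialspace=True is also needed; without it the ' is not the first character of the field, so pandas does not treat it as a quote.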
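As a quick check of the quotechar approach, the sketch below parses the same two sample lines from an in-memory string. Here io.StringIO stands in for comma.csv, and the variable names are illustrative only. It assumes skipinitialspace=True is passed along with quotechar, since the sample has a space after each comma and the quote has to be the first character pandas sees in the field.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for comma.csv, same layout as the sample above.
raw = "userid, username, body\n01, n1, 'string1, string2'\n"

# quotechar="'" keeps the commas inside '...' within one field;
# skipinitialspace=True drops the space after each delimiter so the
# quote is seen at the start of the field.
df = pd.read_csv(io.StringIO(raw), quotechar="'", skipinitialspace=True)

print(df.columns.tolist())
print(df.loc[0, "body"])
```

If the body text in the real file is not wrapped in any quote character, this approach does not apply, because pandas has no way to tell which commas belong to the text.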
