Python using pandas: how to ignore the delimiter within ""?
Question
My CSV files contain a header with 16 columns. The data lines contain 16 values separated by ",".
I just found that some lines contain values within "" that themselves contain a comma. This confuses the parser: instead of the expected 15 commas, it finds 18. One example below:
"23210","Cosmetic","Lancome","Eyes Virtuose Palette Makeup","**7,2g**","W","Decorative range","5x**1,2**g Eye Shadow + **1,2**g Powder","http://image.jpg","","3660732000104","","No","","1","1"
How can I make the parser ignore the commas within ""?
My code is as follows:
import pandas
csv1 = pandas.read_csv('Produktlista.csv', quoting=3)
csv2 = pandas.read_csv('Prislista.csv', quoting= 3)
merged = csv1.merge(csv2, on='id')
merged.to_csv("output.csv", index=False, quoting=3)
Recommended answer
Pass the argument quotechar='"'. From the pandas documentation:
quotechar : str (length 1), optional
The character used to denote the start and end of a quoted item. Quoted items can include the delimiter and it will be ignored.
For example:
In [9]:
import io
import pandas as pd

t='''"23210","Cosmetic","Lancome","Eyes Virtuose Palette Makeup","7,2g","W","Decorative range","5x1,2g Eye Shadow + 1,2g Powder","http://image.jpg","","3660732000104","","No","","1","1"'''
df = pd.read_csv(io.StringIO(t), quotechar='"', header=None)
df
Out[9]:
       0         1        2                             3     4  5  \
0  23210  Cosmetic  Lancome  Eyes Virtuose Palette Makeup  7,2g  W

                  6                                7                 8    9  \
0  Decorative range  5x1,2g Eye Shadow + 1,2g Powder  http://image.jpg  NaN

              10   11  12   13  14  15
0  3660732000104  NaN  No  NaN   1   1
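One detail worth checking against the question's code: quotechar='"' is already the default for read_csv, while quoting=3 corresponds to csv.QUOTE_NONE, which tells the parser to give quote characters no special treatment. The sketch below is only an illustration under that reading. The sample strings and the variable names (line, products_csv, prices_csv) are made up for the example and stand in for Produktlista.csv and Prislista.csv, whose real contents are not shown in the question.

```python
import csv
import io

import pandas as pd

# Hypothetical one-line sample; two of the three quoted fields contain a comma.
line = '"23210","5x1,2g Eye Shadow","7,2g"\n'

# quoting=3 is csv.QUOTE_NONE: quote characters are treated as ordinary text,
# so every comma splits a field and the quotes stay in the values.
raw = pd.read_csv(io.StringIO(line), header=None, quoting=csv.QUOTE_NONE)

# Default quoting (csv.QUOTE_MINIMAL) with the default quotechar='"':
# commas inside quotes are kept as part of the field.
parsed = pd.read_csv(io.StringIO(line), header=None)

print(raw.shape)     # (1, 5)
print(parsed.shape)  # (1, 3)

# Hypothetical stand-ins for the two files being merged in the question.
products_csv = '"id","name","size"\n"23210","Palette","7,2g"\n'
prices_csv = '"id","price"\n"23210","39,90"\n'

products = pd.read_csv(io.StringIO(products_csv))
prices = pd.read_csv(io.StringIO(prices_csv))
merged = products.merge(prices, on="id")

# Writing with the default quoting re-quotes any field that contains a comma,
# so the output can be read back with the same number of columns.
out = io.StringIO()
merged.to_csv(out, index=False)
roundtrip = pd.read_csv(io.StringIO(out.getvalue()))
print(roundtrip.shape)  # (1, 4)
```

Under these assumptions, simply dropping quoting=3 from the read_csv and to_csv calls should be enough for the merge in the question; passing quotechar='"' explicitly is harmless but not required.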