Why is the Python CSV reader ignoring double-quoted fields?

Question

I think this is probably something simple, but after an hour of searching, I've had no luck figuring out what I'm doing wrong.

I'm using the following code to read a CSV file - I have no problem reading the file, but when a line contains a field that is double-quoted because it contains the delimiter, the CSV reader ignores the double-quotes and parses the field into 2 separate fields.

Here's the code I'm using:

myReader = csv.reader(open(inPath, 'r'), dialect='excel', delimiter=',', quotechar='"')
for row in myReader:
    print row,
    print len(row)

My input:

hello, this is row 1, foo1
hello, this is row 2, foo2
goodbye, "this, is row 3", foo3

Which gives me:

['hello', ' this is row 1', ' foo1'] 3
['hello', ' this is row 2', ' foo2'] 3
['goodbye', ' "this', ' is row 3"', ' foo3'] 4

What do I need to change so that it recognizes the double-quoted field as one field? I'm using Python 2.6.1.

Thanks!

Recommended answer

If you look at the dialect that you're using, you'll notice that the excel dialect is configured as follows:

class excel(Dialect):
    """Describe the usual properties of Excel-generated CSV files."""
    delimiter = ','
    quotechar = '"'
    doublequote = True
    skipinitialspace = False
    lineterminator = '\r\n'
    quoting = QUOTE_MINIMAL

Notice that skipinitialspace is set to False. In your input every delimiter is followed by a space, so the quoted field in row 3 starts with a space rather than with the quote character, and the reader treats the quote as ordinary data instead of as the start of a quoted field. Pass skipinitialspace=True to your reader to fix this. By the way, all the arguments you've passed in are already the defaults when using the excel dialect, which is itself the default dialect for csv.reader.
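The point about defaults can be checked directly. The following is a minimal sketch written for Python 3 (the question uses Python 2.6.1, so this version is an assumption); it reads the attributes of the registered excel dialect and parses the third input row with and without the explicit arguments from the question. The variable names are illustrative only.

```python
import csv
import io

# Look up the registered "excel" dialect and read its settings.
excel = csv.get_dialect("excel")
print(excel.delimiter, excel.quotechar, excel.skipinitialspace)

# The third row from the question, with a space after each comma.
line = 'goodbye, "this, is row 3", foo3\n'

# Arguments spelled out as in the question.
explicit = next(csv.reader(io.StringIO(line), dialect="excel",
                           delimiter=",", quotechar='"'))

# No arguments at all: csv.reader falls back to the excel dialect.
default = next(csv.reader(io.StringIO(line)))

print(explicit == default)  # both parse the row the same way
print(len(default))         # the quoted field is still split in two
```

Both calls split the row into 4 fields, which suggests the explicit arguments were not the cause of the problem.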

So, I would re-write your code like so:

>>> import csv
>>> with open(inPath) as fp:
...     reader = csv.reader(fp, skipinitialspace=True)
...     for row in reader:
...         print row,
...         print len(row)
...
['hello', 'this is row 1', 'foo1'] 3
['hello', 'this is row 2', 'foo2'] 3
['goodbye', 'this, is row 3', 'foo3'] 3
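
For readers on Python 3, here is a self-contained sketch of the same fix. It is an adaptation rather than part of the original answer: it feeds the sample input from the question through io.StringIO instead of a file, and parse_rows is a hypothetical helper name introduced only for this illustration.

```python
import csv
import io

# Sample input from the question; note the space after each comma.
SAMPLE = (
    "hello, this is row 1, foo1\n"
    "hello, this is row 2, foo2\n"
    'goodbye, "this, is row 3", foo3\n'
)


def parse_rows(text, skipinitialspace):
    """Parse CSV text into a list of rows (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=skipinitialspace)
    return list(reader)


# Default behaviour: the space before the quote hides the quoting.
before = parse_rows(SAMPLE, skipinitialspace=False)

# With skipinitialspace=True the quote is seen at the start of the field.
after = parse_rows(SAMPLE, skipinitialspace=True)

for row in after:
    print(row, len(row))
```

When reading from a real file in Python 3, the csv documentation recommends opening it with newline='' before passing it to csv.reader.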
