Transform comma separated string into a list but ignore comma in quotes


Problem description

How do I convert "1,,2'3,4'" into a list? Commas separate the individual items, unless they are within quotes. In that case, the comma should be kept as part of the item.

This is the desired result: ['1', '', '2', '3,4']. One regex I found on another thread to ignore the quotes is as follows:

re.compile(r'''((?:[^,"']|"[^"]*"|'[^']*')+)''')

But this gives me the following output:

['', '1', ',,', "2'3,4'", '']

I can't understand where these extra empty strings are coming from, or why the two commas are being printed at all, let alone together.

I tried making this regex myself:

re.compile(r'''(, | "[^"]*" | '[^']*')''')

which ended up not detecting anything, and just returned my original list.

I don't understand why. Shouldn't it detect the commas at the very least? The same problem occurs if I add a ? after the comma.
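
Both results are consistent with the patterns having been passed to re.split. The question does not show the call, so treat that as an assumption. Under that assumption, the sketch below reproduces the two outputs. The first pattern matches the items rather than the separators, so re.split treats each item as a delimiter, returns the text between them (the empty strings and ',,'), and also returns the items themselves because the pattern has a capturing group. In the second pattern, the spaces around each | are literal because re.VERBOSE is not set, so every alternative requires a space that the input does not contain. The variable names are illustrative only.

```python
import re

text = "1,,2'3,4'"

# First pattern from the question: it matches the items (runs of
# non-comma characters and quoted sections), not the separators.
items = re.compile(r'''((?:[^,"']|"[^"]*"|'[^']*')+)''')

# split() cuts the string at every match and, because of the capturing
# group, also inserts each matched item into the result.
split_result = items.split(text)

# findall() returns only the matched items (the empty field is skipped).
findall_result = items.findall(text)

# Second pattern from the question: the spaces are part of the pattern,
# so it looks for ", " or a quoted section preceded by a space.
spaced = re.compile(r'''(, | "[^"]*" | '[^']*')''')

# Nothing matches, so split() returns the whole string as one element.
spaced_result = spaced.split(text)

print(split_result)    # ['', '1', ',,', "2'3,4'", '']
print(findall_result)  # ['1', "2'3,4'"]
print(spaced_result)   # ["1,,2'3,4'"]
```

Adding a ? after the comma in the second pattern does not help for the same reason: the alternative still requires the literal space that follows it.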

Recommended answer

Instead of a regular expression, you might be better off using the csv module, since what you are dealing with is a CSV string:

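from io import StringIO
from csv import reader

# Wrap the string in a file-like object so csv.reader can iterate over it.
file_like_object = StringIO("1,,2,'3,4'")
# quotechar="'" makes commas inside single quotes part of the field.
csv_reader = reader(file_like_object, quotechar="'")
for row in csv_reader:
    print(row)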

This results in the following output:

['1', '', '2', '3,4']
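
If only a single string needs to be parsed, the file-like wrapper is optional, because csv.reader accepts any iterable that yields strings. A minimal sketch along those lines follows; split_quoted is a hypothetical helper name introduced here, not part of the csv module.

```python
import csv


def split_quoted(line, quotechar="'"):
    """Split one comma-separated line, keeping commas inside quotes.

    Hypothetical helper for illustration; only csv.reader is a real API.
    """
    # A one-element list stands in for a file with a single line.
    return next(csv.reader([line], quotechar=quotechar))


print(split_quoted("1,,2,'3,4'"))  # ['1', '', '2', '3,4']
```

Note that csv only treats a quote as the start of a quoted field when it is the first character of that field, which is why the input here has a comma between 2 and '3,4'.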
