Python regex for reading CSV-like rows

Views: 155 | Published: 2020/7/11 23:23:16 | Tags: python, regex, csv


Problem description

I want to parse incoming CSV-like rows of data. Values are separated by commas, with optional leading and trailing whitespace around each comma, and a value can be quoted with either ' or ". For example, this is a valid row:

    data1, data2  ,"data3'''",  'data4""',,,data5,

But this one is malformed:

    data1, data2, da"ta3", 'data4',

-- quotation marks may only be preceded or followed by spaces.

Such malformed rows should be recognized. Ideally the malformed value would be marked within the row, but it is also acceptable if the regex simply fails to match the whole row.

I'm trying to write a regex that can parse this, using either match() or findall(), but every regex I come up with has problems with edge cases.

So, maybe someone with experience parsing something similar could help me with this? (Or maybe this is too complex for a regex and I should just write a function.)
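For illustration, here is a minimal sketch of one possible regex approach: instead of one pattern for the whole row, a single-field pattern is matched repeatedly from the current position. The names `FIELD` and `parse_row` are hypothetical, and the sketch assumes a quoted value cannot contain its own quote character (there is no escaping rule in the description above).

```python
import re

# One field: optional whitespace, then a double-quoted, single-quoted,
# or bare value, then optional whitespace, then a comma or end of line.
# A bare value may not contain a comma or either quote character.
FIELD = re.compile(r"""
    \s*
    (?: "(?P<dq>[^"]*)"
      | '(?P<sq>[^']*)'
      | (?P<bare>[^,'"]*?)
    )
    \s*
    (?P<sep>,|$)
""", re.VERBOSE)


def parse_row(line):
    """Return the list of values; raise ValueError naming the bad field."""
    values = []
    pos = 0
    while True:
        m = FIELD.match(line, pos)
        if m is None:
            raise ValueError(
                f"malformed value in field {len(values)} "
                f"(starting at column {pos})"
            )
        # Exactly one of the three alternatives took part in the match.
        candidates = (m.group("dq"), m.group("sq"), m.group("bare"))
        values.append(next(v for v in candidates if v is not None))
        if m.group("sep") == "":  # reached the end of the line
            return values
        pos = m.end()  # continue right after the comma


valid = "data1, data2  ,\"data3'''\",  'data4\"\"',,,data5,"
print(parse_row(valid))
# ['data1', 'data2', "data3'''", 'data4""', '', '', 'data5', '']
```

With the malformed row from above, `parse_row` raises `ValueError` for field 2 (counting from 0), which is one way to mark the offending value. A trailing comma yields a final empty value, matching what csv.reader does.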

The csv module is not of much use here:

    >>> import csv
    >>> from io import StringIO
    >>> list(csv.reader(StringIO('''2, "dat,a1", 'dat,a2',''')))
    [['2', ' "dat', 'a1"', " 'dat", "a2'", '']]

    >>> list(csv.reader(StringIO('''2,"dat,a1",'dat,a2',''')))
    [['2', 'dat,a1', "'dat", "a2'", '']]

-- unless it can be tuned?
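As a rough check of how far tuning goes, the sketch below passes the standard `skipinitialspace` and `quotechar` options to csv.reader. The helper name `read_row` is hypothetical. Since a csv dialect takes only one `quotechar`, each call can honor just one of the two quote styles.

```python
import csv
from io import StringIO


def read_row(row, quotechar='"'):
    """Parse one row, skipping whitespace that follows each delimiter."""
    reader = csv.reader(
        StringIO(row),
        skipinitialspace=True,  # ignore spaces right after a comma
        quotechar=quotechar,    # only a single quote character is supported
    )
    return next(reader)


row = '''2, "dat,a1", 'dat,a2','''

print(read_row(row))
# ['2', 'dat,a1', "'dat", "a2'", '']

print(read_row(row, quotechar="'"))
# ['2', '"dat', 'a1"', 'dat,a2', '']
```

So the leading-space problem goes away, but whichever quote character is not selected is still treated as ordinary data and its value is split at the comma.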

A few language edits; I hope the English is better now.

Thank you for all the answers. I'm now pretty sure that a regular expression is not a good idea here, because (1) covering all edge cases can be tricky and (2) the writer's output is not regular. Having written that, I've decided to check out the mentioned pyparsing and either use it or write a custom FSM-like parser.
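A custom FSM-like parser could look roughly like the sketch below. It is only one possible shape, not a tested production parser: the function name `parse_row_fsm` and the four state names are hypothetical, and it assumes there is no escape sequence for quotes inside a quoted value.

```python
def parse_row_fsm(line):
    """Split a CSV-like row; raise ValueError on a misplaced quote."""
    START, BARE, QUOTED, AFTER_QUOTE = range(4)
    state = START
    quote = None   # the quote character that opened the current value
    buf = []       # characters of the value being collected
    values = []
    for col, ch in enumerate(line):
        if state == START:
            if ch.isspace():
                continue              # skip leading whitespace
            if ch in "'\"":
                state, quote = QUOTED, ch
            elif ch == ",":
                values.append("")     # empty value
            else:
                buf.append(ch)
                state = BARE
        elif state == BARE:
            if ch == ",":
                values.append("".join(buf).rstrip())
                buf = []
                state = START
            elif ch in "'\"":
                raise ValueError(f"unexpected quote at column {col}")
            else:
                buf.append(ch)
        elif state == QUOTED:
            if ch == quote:
                state = AFTER_QUOTE
            else:
                buf.append(ch)        # the other quote type is plain data
        else:  # AFTER_QUOTE: only whitespace or a comma may follow
            if ch == ",":
                values.append("".join(buf))
                buf = []
                state = START
            elif not ch.isspace():
                raise ValueError(
                    f"unexpected character after closing quote at column {col}"
                )
    if state == QUOTED:
        raise ValueError("unterminated quoted value")
    # Flush the last value; bare values lose their trailing whitespace.
    last = "".join(buf)
    values.append(last.rstrip() if state == BARE else last)
    return values


print(parse_row_fsm("data1, data2  ,\"data3'''\",  'data4\"\"',,,data5,"))
# ['data1', 'data2', "data3'''", 'data4""', '', '', 'data5', '']
```

Because the error is raised at the exact character that violates the rule, the column number in the message can be used to mark the malformed value within the row.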
