CSV Regular Expression


Question

I have inherited some code that uses regular expressions to parse CSV-formatted data. Until now it did not need to cope with empty string fields, but the requirements have changed and empty string fields are now possible.

I have changed the regular expression from this:

new Regex("((?<field>[^\",\\r\\n]+)|\"(?<field>([^\"]|\"\")+)\")(,|(?<rowbreak>\\r\\n|\\n|$))");

to this:

new Regex("((?<field>[^\",\\r\\n]*)|\"(?<field>([^\"]|\"\")*)\")(,|(?<rowbreak>\\r\\n|\\n|$))");

(i.e. I have changed each + to *)

The problem is that I now get an extra empty field at the end. For example, "ID,Name,Description" returns four fields: "ID", "Name", "Description" and "".
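The behaviour described above can be reproduced with a short sketch. The helper FieldsOf and the variable names are hypothetical and not part of the inherited code; the pattern is the modified one from the question, copied unchanged.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Pattern from the question after changing + to *.
var modified = new Regex("((?<field>[^\",\\r\\n]*)|\"(?<field>([^\"]|\"\")*)\")(,|(?<rowbreak>\\r\\n|\\n|$))");

// Hypothetical helper: collect the "field" capture of every match.
string[] FieldsOf(Regex rx, string line) =>
    rx.Matches(line).Cast<Match>().Select(m => m.Groups["field"].Value).ToArray();

string[] fields = FieldsOf(modified, "ID,Name,Description");
Console.WriteLine(fields.Length);             // 4
Console.WriteLine(string.Join("|", fields));  // ID|Name|Description|
```

The fourth field appears because, once "Description" has been consumed, the pattern can still match zero characters followed by $ at the very end of the input.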


Recommended answer

Try this one:

var rx = new Regex("((?<=^|,)(?<field>)(?=,|$)|(?<field>[^\",\\r\\n]+)|\"(?<field>([^\"]|\"\")*)\")(,|(?<rowbreak>\\r\\n|\\n|$))");

I moved the handling of "blank" fields into a third alternative (a third "or") and switched the unquoted branch back to +. The handling of "" already worked (you didn't need to modify it; it was the second (?<field>) block of your code), so what you need to handle are four cases:

,
,Id
Id,
Id,,Name

This should do it:

(?<=^|,)(?<field>)(?=,|$)

An empty field must be preceded by the beginning of the row ^ or by a comma, must be of length zero (there is nothing in the (?<field>) capture), and must be followed by a comma or by the end of the line $.
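As a quick check, the sketch below runs the answer's pattern over the four cases listed above plus the header row from the question. ParseLine is a hypothetical helper name, and the sketch only exercises single-line input.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Pattern from the answer: the first alternative matches an empty field
// using a lookbehind (^ or ,) and a lookahead (, or $).
var rx = new Regex("((?<=^|,)(?<field>)(?=,|$)|(?<field>[^\",\\r\\n]+)|\"(?<field>([^\"]|\"\")*)\")(,|(?<rowbreak>\\r\\n|\\n|$))");

// Hypothetical helper: return the "field" capture of every match in a line.
string[] ParseLine(string line) =>
    rx.Matches(line).Cast<Match>().Select(m => m.Groups["field"].Value).ToArray();

foreach (var line in new[] { ",", ",Id", "Id,", "Id,,Name", "ID,Name,Description" })
{
    string[] parsed = ParseLine(line);
    // Field counts for the five inputs, in order: 2, 2, 2, 3, 3
    Console.WriteLine($"{line} -> {parsed.Length} field(s): [{string.Join("|", parsed)}]");
}
```

With this pattern "ID,Name,Description" yields three fields again, because at the end of the input the empty-field alternative requires a preceding comma and the unquoted alternative requires at least one character.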
