Regex to split a CSV
Question
I know this (or something similar) has been asked many times, but having tried numerous possibilities I've not been able to find a regex that works 100% of the time.
I've got a CSV file and I'm trying to split it into an array, but I'm encountering two problems: quoted commas and empty elements.
The CSV looks like this:
123,2.99,AMO024,Title,"Description, more info",,123987564
The regex I've tried to use is:
thisLine.split(/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/)
The only problem is that in my output array the 5th element comes out as 123987564 and not an empty string.
Recommended answer
Description
Instead of using a split, I think it would be easier to simply execute a match and process all the found matches.
This expression will:
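- divide your sample text on the comma delimiters
- process empty values
- ignore commas inside double quotes, provided the double quotes are not nested
- trim the delimiting comma from the returned value
- trim the surrounding quotes from the returned value
Regex: (?:^|,)(?=[^"]|(")?)"?((?(1)[^"]*|[^,"]*))"?(?=,|$)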
Example text
123,2.99,AMO024,Title,"Description, more info",,123987564
ASP example using the non-Java expression
Set regEx = New RegExp
regEx.Global = True
regEx.IgnoreCase = True
regEx.MultiLine = True
sourcestring = "your source string"
regEx.Pattern = "(?:^|,)(?=[^""]|("")?)""?((?(1)[^""]*|[^,""]*))""?(?=,|$)"
Set Matches = regEx.Execute(sourcestring)
For z = 0 to Matches.Count-1
    results = results & "Matches(" & z & ") = " & chr(34) & Server.HTMLEncode(Matches(z)) & chr(34) & chr(13)
    For zz = 0 to Matches(z).SubMatches.Count-1
        results = results & "Matches(" & z & ").SubMatches(" & zz & ") = " & chr(34) & Server.HTMLEncode(Matches(z).SubMatches(zz)) & chr(34) & chr(13)
    next
    results=Left(results,Len(results)-1) & chr(13)
next
Response.Write "<pre>" & results
Matches using the non-Java expression
Group 0 gets the entire substring which includes the comma
Group 1 gets the quote if it's used
Group 2 gets the value not including the comma
[0][0] = 123
[0][1] =
[0][2] = 123
[1][0] = ,2.99
[1][1] =
[1][2] = 2.99
[2][0] = ,AMO024
[2][1] =
[2][2] = AMO024
[3][0] = ,Title
[3][1] =
[3][2] = Title
[4][0] = ,"Description, more info"
[4][1] = "
[4][2] = Description, more info
[5][0] = ,
[5][1] =
[5][2] =
[6][0] = ,123987564
[6][1] =
[6][2] = 123987564
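For the JavaScript `split` call in the question, the same match-and-collect idea can be sketched as below. This is not the expression from the answer: JavaScript's RegExp has no `(?(1)...)` conditional, so an alternation between a quoted and an unquoted field is swapped in. The helper name `splitCsvLine` is hypothetical. The sketch assumes a single line with no nested or escaped quotes.

```javascript
// Hypothetical helper, not part of the original answer.
// Assumes one CSV line with no nested or escaped double quotes.
function splitCsvLine(line) {
  // Every field is matched together with the comma in front of it.
  // Group 1 holds the contents of a quoted field, group 2 an unquoted field.
  const fieldPattern = /,(?:"([^"]*)"|([^,"]*))(?=,|$)/g;
  const fields = [];
  // Prepending a comma lets the first field match like all the others
  // and avoids zero-length matches at the start of the line.
  for (const match of ("," + line).matchAll(fieldPattern)) {
    fields.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return fields;
}

const thisLine = '123,2.99,AMO024,Title,"Description, more info",,123987564';
console.log(splitCsvLine(thisLine));
// [ '123', '2.99', 'AMO024', 'Title', 'Description, more info', '', '123987564' ]
```

The empty sixth field comes back as an empty string, and the quoted field comes back without its surrounding quotes, which matches the Group 2 values listed above.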