Regex to split a CSV

Question

I know this (or something similar) has been asked many times, but having tried numerous possibilities I've not been able to find a regex that works 100%.

I've got a CSV file and I'm trying to split it into an array, but I'm encountering two problems: quoted commas and empty elements.

The CSV looks like:

123,2.99,AMO024,Title,"Description, more info",,123987564

The regex I've tried to use is:

thisLine.split(/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/)

The only problem is that in my output array the 5th element comes out as 123987564 and not an empty string.

Recommended answer

Description

Instead of using a split, I think it would be easier to simply execute a match and process all the found matches.

This expression will:


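  • divide your sample text on the comma delimiters
  • process empty values
  • ignore commas inside double quotes, provided the double quotes are not nested
  • trim the delimiting comma from the returned value
  • trim the surrounding quotes from the returned value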

Regex: (?:^|,)(?=[^"]|(")?)"?((?(1)[^"]*|[^,"]*))"?(?=,|$)

Sample text

123,2.99,AMO024,Title,"Description, more info",,123987564

ASP example using the non-Java expression

Set regEx = New RegExp
regEx.Global = True
regEx.IgnoreCase = True
regEx.MultiLine = True
sourcestring = "your source string"
regEx.Pattern = "(?:^|,)(?=[^""]|("")?)""?((?(1)[^""]*|[^,""]*))""?(?=,|$)"
Set Matches = regEx.Execute(sourcestring)
  For z = 0 to Matches.Count-1
    results = results & "Matches(" & z & ") = " & chr(34) & Server.HTMLEncode(Matches(z)) & chr(34) & chr(13)
    For zz = 0 to Matches(z).SubMatches.Count-1
      results = results & "Matches(" & z & ").SubMatches(" & zz & ") = " & chr(34) & Server.HTMLEncode(Matches(z).SubMatches(zz)) & chr(34) & chr(13)
    next
    results=Left(results,Len(results)-1) & chr(13)
  next
Response.Write "<pre>" & results

Matches using the non-Java expression

Group 0 gets the entire substring, which includes the comma
Group 1 gets the quote if it's used
Group 2 gets the value, not including the comma

[0][0] = 123
[0][1] = 
[0][2] = 123

[1][0] = ,2.99
[1][1] = 
[1][2] = 2.99

[2][0] = ,AMO024
[2][1] = 
[2][2] = AMO024

[3][0] = ,Title
[3][1] = 
[3][2] = Title

[4][0] = ,"Description, more info"
[4][1] = "
[4][2] = Description, more info

[5][0] = ,
[5][1] = 
[5][2] = 

[6][0] = ,123987564
[6][1] = 
[6][2] = 123987564
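
Since the question is about JavaScript, the same match-and-collect idea can be sketched there as well. This is not the expression from the answer: JavaScript's regex engine does not support the conditional `(?(1)...)`, so the sketch swaps in a plain alternation between a quoted and an unquoted field. The helper name `splitCsvLine` and the trick of prepending a comma (so that every field, including the first, starts with a delimiter) are assumptions made for illustration. It also assumes well-formed input with no escaped or nested quotes and no line breaks inside a field.

```javascript
// Hypothetical helper: returns the fields of one CSV line as an array of strings.
// Assumes well-formed input: no escaped ("") or nested quotes, no newlines in fields.
function splitCsvLine(line) {
  // Group 1: contents of a quoted field (quotes stripped).
  // Group 2: an unquoted field, which may be empty.
  // The lookahead requires the field to end at the next comma or at end of line.
  const fieldPattern = /,(?:"([^"]*)"|([^,"]*))(?=,|$)/g;
  const fields = [];
  // Prepending a comma means each field is preceded by exactly one delimiter,
  // so a leading empty field is not lost.
  for (const match of ("," + line).matchAll(fieldPattern)) {
    fields.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return fields;
}

const sample = '123,2.99,AMO024,Title,"Description, more info",,123987564';
console.log(splitCsvLine(sample));
```

For the sample line this yields seven elements, with `Description, more info` kept as one field and an empty string in the sixth position. Fields that break the assumptions (for example a stray quote in the middle of a value) are skipped rather than reported, so a dedicated CSV parser is likely a better fit for untrusted input.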
