javascript/regex to ignore semicolons in double quotes

Problem description

I've been stumped for a bit on this one. I have a string that is almost a semicolon-delimited string; it looks something like this:


one; two; three "four; five;six"; seven

I'd like to split this up using a regex in JavaScript into an array like this (i.e. ignoring any semicolons inside double quotes):


['one','two','three "four; five;six"','seven']

I've tried adapting known working CSV functions, but I can't seem to adapt them to handle the third element ('three "four;five;six"').

It seems like a regex type of problem, but if a solution exists that uses more than regex, I'm certainly interested!

Update: I should also note that there may be spaces before or after the semicolons in the quoted string. I've updated the example to reflect that.

Recommended answer

Assuming you don't allow escaped quotes inside your quotes (e.g. "this has \"escaped quotes\" inside"), this should work:

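// Match each field: runs of non-semicolon, non-quote characters,
// optionally interleaved with double-quoted sections.
var rx = /(?!;|$)[^;"]*(("[^"]*")[^;"]*)*/g;
var str = 'one; two; three "four;five;six"; seven';
var res = str.match(rx);
// res = ['one', ' two', ' three "four;five;six"', ' seven']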

Note that you need the negative lookahead (?!;|$) at the beginning of the regex to keep it from matching the empty string. Every other part of the pattern is optional, so without the lookahead the match method also returns an empty string in front of each semicolon and at the end of the input.
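To see what the lookahead prevents, here is a small sketch that runs the pattern with and without it on the example string. The variable names (withoutLookahead, withLookahead, loose, strict) are only for illustration and are not part of the original answer.

```javascript
// The answer's pattern with the leading negative lookahead removed.
var withoutLookahead = /[^;"]*(("[^"]*")[^;"]*)*/g;
// The answer's pattern as given.
var withLookahead = /(?!;|$)[^;"]*(("[^"]*")[^;"]*)*/g;

var str = 'one; two; three "four;five;six"; seven';

// Without the lookahead, an empty match is reported at each semicolon
// and once more at the end of the string.
var loose = str.match(withoutLookahead);
// loose = ['one', '', ' two', '', ' three "four;five;six"', '', ' seven', '']

// With the lookahead, a match may not start at a semicolon or at the end.
var strict = str.match(withLookahead);
// strict = ['one', ' two', ' three "four;five;six"', ' seven']

console.log(loose);
console.log(strict);
```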

Update:

I think this regular expression should work with escaped quotes as well (although I'd appreciate feedback on its correctness). I've also added \s to the negative lookahead to strip whitespace after the preceding semicolon.

/(?!\s|;|$)[^;"]*("(\\.|[^\\"])*"[^;"]*)*/g
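As a quick check of the updated pattern, here is a minimal sketch that wraps it in a helper and runs it on the original example and on a string containing escaped quotes. The helper name splitFields and the second test string are assumptions made for illustration. Note that the pattern only strips whitespace at the start of each field, so trailing whitespace before a semicolon would be kept.

```javascript
// Hypothetical helper: returns the fields of a semicolon-delimited string,
// treating semicolons inside double quotes as ordinary characters.
function splitFields(input) {
  var rx = /(?!\s|;|$)[^;"]*("(\\.|[^\\"])*"[^;"]*)*/g;
  // match() returns null when nothing matches, so fall back to an empty array.
  return input.match(rx) || [];
}

var plain = splitFields('one; two; three "four;five;six"; seven');
// plain = ['one', 'two', 'three "four;five;six"', 'seven']

// The input here is:  a; "b \"c; d\" e"; f
var escaped = splitFields('a; "b \\"c; d\\" e"; f');
// escaped = ['a', '"b \\"c; d\\" e"', 'f']

console.log(plain);
console.log(escaped);
```

If the fields should also lose trailing whitespace, calling trim() on each result is one simple option.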
