Regex to match all instances not inside quotes


Question

From this q/a, I deduced that matching all instances of a given regex that are not inside quotes is impossible. That is, it can't handle escaped quotes (ex: "this whole \"match\" should be taken"). If there is a way to do it that I don't know about, that would solve my problem.

If not, however, I'd like to know whether there is an efficient alternative that could be used in JavaScript. I've thought about it a bit, but can't come up with any elegant solution that would work in most, if not all, cases.

Specifically, I just need the alternative to work with the .split() and .replace() methods, but a more general solution would be best.

For example, given the input string:
+bar+baz"not+or\"+or+\"this+"foo+bar+
replacing + with #, but only outside quotes, should return:
#bar#baz"not+or\"+or+\"this+"foo#bar#

Recommended answer

Actually, you can match all instances of a regex that are not inside quotes, for any string in which every opening quote is closed again. Say, as in your example above, you want to match +.

The key observation here is that a match is outside quotes if it is followed by an even number of quotes. This can be modeled as a look-ahead assertion:

\+(?=([^"]*"[^"]*")*[^"]*$)
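
As a minimal sketch of how this look-ahead might be used with .replace(): the sample string and the variable names below are made up for illustration, and the groups are written as non-capturing (?:...) groups, which is an assumption of this sketch rather than part of the answer. The global flag is needed so that every match is replaced.

```javascript
// "+" followed by an even number of double quotes up to the end of the string,
// i.e. a "+" that sits outside any quoted section. Escaped quotes are NOT handled here.
const simplePlus = /\+(?=(?:[^"]*"[^"]*")*[^"]*$)/g;

// Hypothetical sample input with balanced, unescaped quotes.
const sample = 'a+b"c+d"e+f';

const replaced = sample.replace(simplePlus, '#');
console.log(replaced); // a#b"c+d"e#f
```

The + inside "c+d" is followed by a single quote, an odd count, so the look-ahead fails there and the character is left alone.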

Now, you'd like to not count escaped quotes. This gets a little more complicated. Instead of [^"]*, which advances to the next quote, you need to consider backslashes as well and use [^"\\]*. That stops at either a backslash or a quote. If it is a backslash, you need to ignore the next character; if it is a quote, you need to advance to the next unescaped quote. That looks like (\\.|"([^"\\]*\\.)*[^"\\]*"). Combined, you arrive at

\+(?=([^"\\]*(\\.|"([^"\\]*\\.)*[^"\\]*"))*[^"]*$)
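
A sketch of how the escape-aware pattern could be applied to the input from the question, using both .replace() and .split(). The variable names are illustrative. The groups are rewritten as non-capturing here because .split() splices the contents of capturing groups into its result array, which is usually not what is wanted; that rewrite is an adaptation made for this sketch, not part of the answer.

```javascript
// Escape-aware look-ahead: skips over backslash-escaped characters and over
// complete quoted sections, then requires that no quote remains before the end.
const escapeAwarePlus = /\+(?=(?:[^"\\]*(?:\\.|"(?:[^"\\]*\\.)*[^"\\]*"))*[^"]*$)/g;

// The input string from the question; String.raw keeps the backslashes literal.
const input = String.raw`+bar+baz"not+or\"+or+\"this+"foo+bar+`;

const replaced = input.replace(escapeAwarePlus, '#');
console.log(replaced); // #bar#baz"not+or\"+or+\"this+"foo#bar#

// The same pattern works as a separator for split().
const parts = input.split(escapeAwarePlus);
console.log(parts.length); // 5
// pieces: '', 'bar', 'baz"not+or\"+or+\"this+"foo', 'bar', ''
```

As the answer notes, this relies on every opening quote being closed again; an unbalanced quote changes which positions count as outside.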

I admit it is a little cryptic. =)
