Regex to match all instances not inside quotes

Question

From this q/a, I deduced that matching all instances of a given regex that are not inside quotes is impossible. That is, it can't handle escaped quotes (e.g. "this whole \"match\" should be taken"). If there is a way to do it that I don't know about, that would solve my problem.

If not, however, I'd like to know whether there is an efficient alternative that could be used in JavaScript. I've thought about it a bit, but can't come up with an elegant solution that would work in most, if not all, cases.

Specifically, I just need the alternative to work with the .split() and .replace() methods, but a more general solution would be best.

For example:

An input string of:
+bar+baz"not+or\"+or+\"this+"foo+bar+

replacing + with #, when not inside quotes, would return:
#bar#baz"not+or\"+or+\"this+"foo#bar#

Recommended answer

Actually, you can match all instances of a regex that are not inside quotes, for any string in which each opening quote is closed again. Say, as in your example above, you want to match \+.

The key observation here is that a word is outside quotes if an even number of quotes follows it. This can be modeled as a look-ahead assertion:

\+(?=([^"]*"[^"]*")*[^"]*$)
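As a minimal sketch of how this look-ahead behaves in JavaScript, the pattern can be passed to .replace() with the g flag. The variable names and the sample string are illustrative only and do not come from the original post. The sketch assumes the input has no escaped quotes and that every opening quote is closed:

```javascript
// Matches a "+" only when an even number of double quotes follows it,
// i.e. when the "+" sits outside any quoted section.
// Assumption: quotes are balanced and never escaped.
const plusOutsideSimple = /\+(?=([^"]*"[^"]*")*[^"]*$)/g;

// Illustrative sample string (not from the original post).
const sample = 'a+b "c+d" e+f';

const simpleResult = sample.replace(plusOutsideSimple, '#');
console.log(simpleResult); // a#b "c+d" e#f
```

The "+" between c and d is followed by a single quote, so the look-ahead fails there and the character is left alone.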

Now, you'd like to not count escaped quotes. This gets a little more complicated. Instead of [^"]*, which advances to the next quote, you need to consider backslashes as well and use [^"\\]*. Once you arrive at either a backslash or a quote, you need to skip the next character if it was a backslash, or else advance to the next unescaped quote. That looks like (\\.|"([^"\\]*\\.)*[^"\\]*"). Combined, you arrive at

\+(?=([^"\\]*(\\.|"([^"\\]*\\.)*[^"\\]*"))*[^"]*$)
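The following sketch tries the escape-aware pattern with both .replace() and .split() in JavaScript. It makes two assumptions that are not part of the original answer: the groups are rewritten as non-capturing (?:...), because .split() would otherwise splice the captured text into its result array, and the input is a string in the spirit of the question's example. All variable names are illustrative.

```javascript
// Escape-aware variant: a "+" matches only if the rest of the string consists
// of unquoted text, escaped characters (\x) and complete "quoted sections".
// Groups are non-capturing so that .split() returns only the pieces.
const plusOutsideEscaped =
  /\+(?=(?:[^"\\]*(?:\\.|"(?:[^"\\]*\\.)*[^"\\]*"))*[^"]*$)/g;

// String.raw keeps the backslashes literal, so \" stays a backslash plus quote.
const input = String.raw`+bar+baz"not+or\"+or+\"this+"foo+bar+`;

const result = input.replace(plusOutsideEscaped, '#');
console.log(result); // #bar#baz"not+or\"+or+\"this+"foo#bar#

const parts = input.split(plusOutsideEscaped);
console.log(parts.length); // 5
console.log(parts[1]); // bar
```

Because the look-ahead rescans the remainder of the string for every candidate "+", the running time can grow quadratically with input length. For very long inputs, a single left-to-right scan that tracks whether it is inside quotes may be a better fit.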

I admit it is a little cryptic. =)
