Alternative to regex: match all instances not inside quotes


Question

From this q/a, I deduced that matching all instances of a given regex not inside quotes is impossible. That is, it can't handle escaped quotes (ex: "this whole \"match\" should be taken"). If there is a way to do it that I don't know about, that would solve my problem.

If not, however, I'd like to know if there is any efficient alternative that could be used in JavaScript. I've thought about it a bit, but can't come up with any elegant solution that would work in most, if not all, cases.

Specifically, I just need the alternative to work with the .split() and .replace() methods, but if it could be more generalized, that would be best.



Input string:
+bar+baz"not+or\"+or+\"this+"foo+bar+

Replacing + with #, but not inside quotes, would return:
#bar#baz"not+or\"+or+\"this+"foo#bar#

Recommended answer

Actually, you can match all instances of a regex not inside quotes for any string in which each opening quote is closed again. Say, as in your example above, you want to match \+.

The key observation here is that a match is outside quotes if an even number of quotes follows it. This can be modeled as a look-ahead assertion:

\+(?=([^"]*"[^"]*")*[^"]*$)
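As a quick check of this first pattern, here is a minimal JavaScript sketch. The sample string and the variable names are made up for illustration, and the string deliberately contains no escaped quotes, since this version does not handle them yet.

```javascript
// Matches a "+" only when an even number of double quotes follows it,
// i.e. when the "+" is outside every quoted section.
const plusOutsideQuotes = /\+(?=([^"]*"[^"]*")*[^"]*$)/g;

// Illustrative input without escaped quotes.
const simpleInput = '+bar+"a+b"+c';
const simpleResult = simpleInput.replace(plusOutsideQuotes, '#');

console.log(simpleResult); // #bar#"a+b"#c
```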

Now, you'd like to not count escaped quotes. This gets a little more complicated. Instead of [^"]*, which advances to the next quote, you need to consider backslashes as well and use [^"\\]*. After you arrive at either a backslash or a quote, you need to skip the next character if it was a backslash, or else advance to the next unescaped quote. That looks like (\\.|"([^"\\]*\\.)*[^"\\]*"). Combined, you arrive at

\+(?=([^"\\]*(\\.|"([^"\\]*\\.)*[^"\\]*"))*[^"]*$)
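The sketch below tries the combined pattern with both .replace() and .split(). It assumes the question's input string is `+bar+baz"not+or\"+or+\"this+"foo+bar+`. One deliberate change from the pattern above: the groups are written as non-capturing `(?:...)`, because JavaScript's .split() inserts captured groups into its result array. The variable names are illustrative.

```javascript
// "+" outside double-quoted sections, with \" treated as an escaped quote.
// Groups are non-capturing so that .split() does not splice captures
// into its result; otherwise the pattern is the same as above.
const plusOutsideEscapedQuotes =
  /\+(?=(?:[^"\\]*(?:\\.|"(?:[^"\\]*\\.)*[^"\\]*"))*[^"]*$)/g;

// Assumed input from the question (backslashes doubled for the JS literal).
const input = '+bar+baz"not+or\\"+or+\\"this+"foo+bar+';

const replaced = input.replace(plusOutsideEscapedQuotes, '#');
console.log(replaced); // #bar#baz"not+or\"+or+\"this+"foo#bar#

const parts = input.split(plusOutsideEscapedQuotes);
console.log(parts); // [ '', 'bar', 'baz"not+or\\"+or+\\"this+"foo', 'bar', '' ]
```

The look-ahead rescans the rest of the string for every candidate match, so on very long inputs this may be noticeably slower than a single left-to-right pass.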

I admit it is a little cryptic. =)
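To make the pattern less cryptic and somewhat more general, it can be assembled from named pieces. The helper below is a hypothetical sketch, not part of the original answer: `outsideQuotes` and its parameters are invented names, and it assumes that the wrapped pattern never matches a double quote itself and that the `m` and `s` flags are not used (they would change what `$` and `.` mean in the look-ahead).

```javascript
// Hypothetical helper: wraps any regex source so that it only matches
// outside double-quoted sections (\" counts as an escaped quote).
function outsideQuotes(source, flags = 'g') {
  const plain = '[^"\\\\]*';   // run of characters that are neither quote nor backslash
  const escaped = '\\\\.';     // a backslash plus the character it escapes
  const quoted = `"(?:${plain}${escaped})*${plain}"`; // one complete quoted section
  // What may follow a match: only escapes and complete quoted sections,
  // then a quote-free tail up to the end of the input.
  const lookahead = `(?=(?:${plain}(?:${escaped}|${quoted}))*[^"]*$)`;
  return new RegExp(`(?:${source})${lookahead}`, flags);
}

// Usage: split on commas outside quotes, replace "+" outside quotes.
const fields = 'a,"b,c",d'.split(outsideQuotes(','));
console.log(fields); // [ 'a', '"b,c"', 'd' ]

const swapped = 'x+y"+\\"+"z'.replace(outsideQuotes('\\+'), '#');
console.log(swapped); // x#y"+\"+"z
```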
