Regular expression: match word not between quotes

Problem description

I would like a Python regular expression that matches a given word only when it is not between single quotes. I tried a negative lookahead, (?! ...), without success.

In the sample text below, I would like to match every foe except the one inside the quoted string on the 4th line.

Also, the text is given as one big string.

Here is the regex101 link, and the sample text is below:

var foe = 10;
foe = "";
dark_vador = 'bad guy'
foe = ' I\'m your father, foe ! '
bar = thingy + foe

Solution

The regex solution below works in most cases, but it may break if unbalanced single quotes appear outside string literals, e.g. in comments.
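To make that caveat concrete, here is a small sketch using the pattern from the demo further down. The input text is a made-up example (not from the question): the apostrophe in the comment is unbalanced, so the regex treats everything up to the next quote as a string literal and skips the first foe.

```python
import re

# Same pattern as in the demo below, with the word hard-coded as "foe".
rx = r"('[^'\\]*(?:\\.[^'\\]*)*')|\bfoe\b"

# Hypothetical input: the apostrophe in "don't" opens a bogus "literal"
# that runs up to the opening quote of 'x'.
text = "// don't touch\nfoe = 1\nname = 'x'\nfoe = 2"

result = re.sub(rx, lambda m: m.group(1) if m.group(1) else "NEWORD", text)
print(result)
# The foe on the second line is left untouched; only the last one is replaced.
```

If comments like this can occur in the input, a tokenizer or parser for the language in question is likely a safer choice than a single regex.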

A common regex trick for matching in context is to match what you need to replace, and to match and capture what you need to keep.

Here is a sample Python demo:

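import re

# Group 1 captures a single-quoted string literal (backslash escapes allowed);
# the second alternative matches the whole word anywhere else.
rx = r"('[^'\\]*(?:\\.[^'\\]*)*')|\b{0}\b"
s = r"""
    var foe = 10;
    foe = "";
    dark_vador = 'bad guy'
    foe = ' I\'m your father, foe ! '
    bar = thingy + foe"""
toReplace = "foe"
# Put captured literals back unchanged; replace every other match.
# re.escape keeps the word safe if it ever contains regex metacharacters.
res = re.sub(rx.format(re.escape(toReplace)), lambda m: m.group(1) if m.group(1) else 'NEWORD', s)
print(res)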

See the Python demo

The regex will look like

('[^'\\]*(?:\\.[^'\\]*)*')|\bfoe\b

See the regex demo.

The ('[^'\\]*(?:\\.[^'\\]*)*') part captures single-quoted string literals into Group 1; when it matches, the literal is simply put back into the result. The \bfoe\b part matches the whole word foe in any other context, and that match is then replaced with another word.
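The same idea also works when the goal is only to find the matches rather than replace them. The following is a minimal sketch, not part of the original answer: it iterates over all matches and keeps those where Group 1 did not participate. The variable names are illustrative.

```python
import re

rx = re.compile(r"('[^'\\]*(?:\\.[^'\\]*)*')|\bfoe\b")

text = r"""var foe = 10;
foe = "";
dark_vador = 'bad guy'
foe = ' I\'m your father, foe ! '
bar = thingy + foe"""

# A match with an empty Group 1 is a real occurrence of the word
# outside single-quoted literals.
positions = [m.start() for m in rx.finditer(text) if m.group(1) is None]

# Convert character offsets to 1-based line numbers for readability.
lines = [text.count("\n", 0, p) + 1 for p in positions]
print(lines)  # [1, 2, 4, 5]
```

Line 4 still appears once because the foe before the equals sign is outside the quotes; the one inside the literal is skipped.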

NOTE: To also match double-quoted string literals, use r"('[^'\\]*(?:\\.[^'\\]*)*'|\"[^\"\\]*(?:\\.[^\"\\]*)*\")" as the capturing group.
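As a sketch of how the extended group from the note might be combined with the word alternative, consider the helper below. The function name replace_outside_quotes and the sample string are assumptions made for illustration; they do not come from the original answer.

```python
import re

# Group 1 captures either a single-quoted or a double-quoted literal.
LITERAL = r"('[^'\\]*(?:\\.[^'\\]*)*'|\"[^\"\\]*(?:\\.[^\"\\]*)*\")"

def replace_outside_quotes(text, word, replacement):
    """Replace whole-word occurrences of `word` that are outside quotes.

    Hypothetical helper: quoted literals are matched first and returned
    unchanged, so the word alternative only fires outside them.
    """
    rx = re.compile(LITERAL + r"|\b" + re.escape(word) + r"\b")
    return rx.sub(lambda m: m.group(1) if m.group(1) else replacement, text)

sample = 'foe = "foe"; bar = \'foe\' + foe'
print(replace_outside_quotes(sample, "foe", "NEWORD"))
# NEWORD = "foe"; bar = 'foe' + NEWORD
```

Building the pattern by concatenation instead of str.format avoids trouble if the pattern ever gains literal braces, such as a {2,3} quantifier.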
