Escape all metacharacters in Python


Question

I need to search for patterns which may have many metacharacters. Currently I use a long regex.

prodObjMatcher=re.compile(r"""^(?P<nodeName>[\w\/\:\[\]\<\>\@\$]+)""", re.S|re.M|re.I|re.X)

(My actual pattern is much longer, so I have pasted only the relevant portion I need help with.)

This is especially painful when I need to combine several such patterns in a single re.compile call.

Is there a Pythonic way to shorten the pattern?

Recommended answer

Well, your pattern can be reduced to

r"""^(?P<nodeName>[]\w/:[<>@$]+).*?"""
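As a quick sanity check, the sketch below compiles both the original and the reduced pattern with the flags from the question and compares what they capture. The sample strings and the helper `node_name` are made up for illustration; they are not from the original post.

```python
import re

# Flags copied from the question.
FLAGS = re.S | re.M | re.I | re.X

original = re.compile(r"""^(?P<nodeName>[\w\/\:\[\]\<\>\@\$]+)""", FLAGS)
simplified = re.compile(r"""^(?P<nodeName>[]\w/:[<>@$]+).*?""", FLAGS)

# Hypothetical inputs, chosen to exercise every character in the class.
samples = ["top/core:u1[3]<clk>@5$ rest", "plain_name", " leading space"]


def node_name(pattern, line):
    """Return the nodeName group, or None if the line does not match."""
    m = pattern.match(line)
    return m.group("nodeName") if m else None


# Pair up the capture from each pattern for every sample.
results = [(node_name(original, s), node_name(simplified, s)) for s in samples]
for sample, (old, new) in zip(samples, results):
    print(repr(sample), old, new, old == new)
```

On these inputs both patterns should capture the same text, which suggests the backslashes in the original class were not doing any work.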

Note that inside a character class you never have to escape a non-word character, except for ^, -, ], and \ (a backslash also starts shorthand classes such as \w). There are ways to keep even those (except for \) unescaped in the character class:

  • ] at the start of the character class
  • - at the start/end of the character class
  • ^ anywhere but the start of the character class (it needs escaping only when it comes first and is meant as a literal symbol)
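The placement rules above can be tried out with a short sketch. The sample string and variable names are only illustrative.

```python
import re

# ']' first, '^' not first, '-' last: all three are literal without backslashes.
unescaped = re.compile(r"[]^-]")
# The same set with every character escaped, for comparison.
escaped = re.compile(r"[\]\^\-]")
# A backslash always has to be escaped; '/' never does.
backslash = re.compile(r"[\\/]")

sample = r"a]b^c-d\e/f"  # raw string, so it holds one real backslash

unescaped_hits = unescaped.findall(sample)
escaped_hits = escaped.findall(sample)
backslash_hits = backslash.findall(sample)

print(unescaped_hits, escaped_hits, backslash_hits)
```

Both of the first two classes find the same three characters, so the choice between them is purely about readability.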

Outside a character class, you must escape \, [, (, ), +, $, ^, *, ?, ., and |, as well as { when it could be read as the start of a quantifier such as {2}.
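When the text to match is a plain literal rather than a hand-written pattern, the standard library's re.escape can add the backslashes for you. The sketch below is one way to use it, including joining several escaped literals into one alternation; the strings and variable names are invented for the example.

```python
import re

# A literal full of metacharacters (hypothetical sample text).
literal = "price[0] (USD) = $1.50?"
text = "row: price[0] (USD) = $1.50? ok"

# Used as-is, the literal is read as a pattern and does not find itself.
naive_match = re.search(literal, text)

# re.escape backslash-escapes the special characters so it matches verbatim.
escaped_match = re.search(re.escape(literal), text)

# Several literals can be escaped one by one and joined with '|'.
tokens = ["a+b", "c|d", "e.f"]
combined = re.compile("|".join(re.escape(t) for t in tokens))
token_hits = combined.findall("a+b c|d exf e.f")

print(naive_match, escaped_match.group(0), token_hits)
```

This only helps for the literal parts of a pattern; the pieces that are meant to be regex syntax still have to be written by hand.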

Note that / is not a special regex metacharacter in Python regex patterns, and does not have to be escaped.

Use raw string literals when defining your regex patterns to avoid issues (like confusing word boundary r'\b' and a backspace '\b').
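The difference between the two spellings of \b is easy to see in a small sketch (the sample text is made up):

```python
import re

text = "a cat sat"

# Raw string: the regex engine receives backslash + 'b', a word boundary.
raw_hits = re.findall(r"\bcat\b", text)

# Plain string: Python turns "\b" into a backspace character (0x08) first,
# so the pattern looks for backspace + "cat" + backspace.
plain_hits = re.findall("\bcat\b", text)

print(len(r"\b"), len("\b"))  # two characters versus one
print(raw_hits, plain_hits)
```

Only the raw-string version finds the word, because the plain string never delivers a backslash to the regex engine.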
