How to use a variable inside a regular expression?
Question
I'd like to use a variable inside a regex. How can I do this in Python?
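import re
import sys

TEXTO = sys.argv[1]
# subject is the string being searched; note that TEXTO below is literal text, not the variable
if re.search(r"\b(?=\w)TEXTO\b(?!\w)", subject, re.IGNORECASE):
    pass  # Successful match
else:
    pass  # Match attempt failed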
Recommended answer
From Python 3.6 on, you can also use literal string interpolation ("f-strings"). In your particular case the solution would be:
if re.search(rf"\b(?=\w){TEXTO}\b(?!\w)", subject, re.IGNORECASE):
    ...  # do something
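To see the difference between the pattern in the question and the f-string version, here is a minimal self-contained sketch. The values of TEXTO and subject are made up for illustration and are not from the original post.

```python
import re

# Illustrative values (assumptions, not from the original question)
TEXTO = "hello"
subject = "Say Hello to everyone"

# Without interpolation, the pattern looks for the literal text "TEXTO"
literal = re.search(r"\b(?=\w)TEXTO\b(?!\w)", subject, re.IGNORECASE)

# With an rf-string, {TEXTO} is replaced by the variable's value
interpolated = re.search(rf"\b(?=\w){TEXTO}\b(?!\w)", subject, re.IGNORECASE)

print(literal)               # None
print(interpolated.group())  # Hello
```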
Edit:
Since there have been some questions in the comments on how to deal with special characters, I'd like to extend my answer:
Raw strings ('r'):
One of the main concepts you have to understand when dealing with special characters in regular expressions is the distinction between string literals and the regular expression itself.
In short:
Let's say that instead of finding a word boundary \b after TEXTO you want to match the string \boundary. Then you have to write:
import re

TEXTO = "Var"
subject = r"Var\boundary"
if re.search(rf"\b(?=\w){TEXTO}\\boundary(?!\w)", subject, re.IGNORECASE):
    print("match")
This only works because we are using a raw string (the regex is preceded by 'r'); otherwise we would have to write "\\\\boundary" in the regex (four backslashes). Additionally, without the 'r' prefix, '\b' would no longer be interpreted as a word boundary but as a backspace character!
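The two claims about raw strings can be checked directly. The following is a small sketch; the variable names raw_pattern and plain_pattern are chosen here for illustration only.

```python
import re

subject = r"Var\boundary"  # contains one literal backslash

# In a raw string, two backslashes reach the regex engine as an escaped backslash
raw_pattern = r"Var\\boundary"
# In a normal string, each of those backslashes must itself be doubled
plain_pattern = "Var\\\\boundary"

print(raw_pattern == plain_pattern)             # True
print(bool(re.search(raw_pattern, subject)))    # True

# Without the r prefix, "\b" is the backspace character (0x08), not a word boundary
print("\b" == "\x08")                           # True
print(re.search("\bVar", subject))              # None
print(re.search(r"\bVar", subject).group())     # Var
```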
re.escape:
Basically, it puts a backslash in front of any special character. Hence, if you expect a special character in TEXTO, you need to write:
if re.search(rf"\b(?=\w){re.escape(TEXTO)}\b(?!\w)", subject, re.IGNORECASE):
    print("match")
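As a rough illustration of why escaping matters, the sketch below wraps the pattern in a hypothetical helper, contains_word (not part of the original answer), and uses a search term containing a dot. Unescaped, the dot would match any character.

```python
import re

def contains_word(texto, subject):
    """Hypothetical helper: search for texto as literal text, using the answer's pattern."""
    pattern = rf"\b(?=\w){re.escape(texto)}\b(?!\w)"
    return re.search(pattern, subject, re.IGNORECASE) is not None

print(re.escape("a.b"))                               # a\.b
print(contains_word("a.b", "version a.b released"))   # True
print(contains_word("a.b", "version axb released"))   # False

# Without re.escape, "." is a wildcard and "axb" matches as well
unescaped = re.search(r"\b(?=\w)a.b\b(?!\w)", "version axb released")
print(unescaped is not None)                          # True
```

Note that the \b anchors in this pattern still assume that the search term starts and ends with a word character.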
NOTE: For any version >= Python 3.7, the characters !, ", %, ', ,, /, :, ;, <, =, >, @, and ` are not escaped. Only special characters with meaning in a regex are still escaped. _ has not been escaped since Python 3.3.
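A quick way to check this behaviour on your own interpreter is sketched below. The version guard is there because the result depends on the Python version; the variable name unescaped_since_37 is made up for this example.

```python
import re
import sys

# The characters listed in the note above
unescaped_since_37 = "!\"%',/:;<=>@`"

if sys.version_info >= (3, 7):
    # On 3.7+ these characters should come back unchanged
    print(re.escape(unescaped_since_37) == unescaped_since_37)
    print(re.escape("_") == "_")

# Characters with a regex meaning are still escaped
print(re.escape("a.b*c"))
```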
Curly braces:
If you want to use quantifiers within the regular expression using f-strings, you have to use double curly braces. Let's say you want to match TEXTO followed by exactly 2 digits:
if re.search(rf"\b(?=\w){re.escape(TEXTO)}\d{{2}}\b(?!\w)", subject, re.IGNORECASE):
    print("match")
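The effect of the doubled braces is easiest to see by printing the resulting pattern. This is a small sketch with an illustrative value for TEXTO and made-up subject strings.

```python
import re

TEXTO = "Var"  # illustrative value

# {{2}} in the f-string becomes the literal quantifier {2} in the pattern
pattern = rf"\b(?=\w){re.escape(TEXTO)}\d{{2}}\b(?!\w)"
print(pattern)  # \b(?=\w)Var\d{2}\b(?!\w)

# With single braces, {2} is evaluated as an f-string expression and the quantifier is lost
print(rf"\d{2}")  # \d2

print(bool(re.search(pattern, "see var42 now", re.IGNORECASE)))   # True
print(bool(re.search(pattern, "see var4 now", re.IGNORECASE)))    # False
print(bool(re.search(pattern, "see var421 now", re.IGNORECASE)))  # False
```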