How to use a variable inside a regular expression?


Question

I'd like to use a variable inside a regex. How can I do this in Python?

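import re
import sys

TEXTO = sys.argv[1]

# "TEXTO" below is matched as the literal text TEXTO, not as the variable's value
if re.search(r"\b(?=\w)TEXTO\b(?!\w)", subject, re.IGNORECASE):
    pass  # Successful match
else:
    pass  # Match attempt failed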


Recommended answer

From Python 3.6 on, you can also use literal string interpolation ("f-strings"). In your particular case the solution would be:

if re.search(rf"\b(?=\w){TEXTO}\b(?!\w)", subject, re.IGNORECASE):
    ...  # do something
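
As a minimal sketch of the f-string approach, the snippet below wraps the search in a small helper. The function name `contains_word` and the sample strings are assumptions made for illustration; they are not part of the original answer.

```python
import re

def contains_word(texto: str, subject: str) -> bool:
    """Return True if `texto` occurs in `subject` as a whole word (case-insensitive)."""
    # rf"..." combines a raw string with f-string interpolation,
    # so {texto} is replaced by the variable's value before the regex is compiled.
    pattern = rf"\b(?=\w){texto}\b(?!\w)"
    return re.search(pattern, subject, re.IGNORECASE) is not None

print(contains_word("hello", "Say HELLO to everyone"))  # True
print(contains_word("hello", "Say helloworld"))         # False
```

Note that this sketch interpolates the value unescaped, so it is only safe when the value contains no regex metacharacters.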

Edit:

Since there have been some questions in the comments on how to deal with special characters, I'd like to extend my answer:

Raw strings ('r'):

One of the main concepts you have to understand when dealing with special characters in regular expressions is the distinction between string literals and the regular expression itself. It is very well explained here:

In short:

Let's say that instead of finding a word boundary \b after TEXTO you want to match the string \boundary. Then you have to write:

TEXTO = "Var"
subject = r"Var\boundary"

if re.search(rf"\b(?=\w){TEXTO}\\boundary(?!\w)", subject, re.IGNORECASE):
    print("match")

This only works because we are using a raw string (the regex is preceded by 'r'); otherwise we would have to write "\\\\boundary" in the regex (four backslashes). Additionally, without the 'r' prefix, '\b' would no longer be interpreted as a word boundary but as a backspace character!
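
The difference between the string literal and the regex it produces can be checked directly. The following is an illustrative sketch; the variable names `raw_pattern` and `plain_pattern` are assumptions introduced here.

```python
import re

# Without the r prefix, "\b" is a single backspace character (code point 8).
print(len("\b"), ord("\b"))  # 1 8
# With the r prefix, the backslash and the letter b stay two separate characters.
print(len(r"\b"))            # 2

subject = r"Var\boundary"
raw_pattern = r"Var\\boundary"      # the regex sees \\ , i.e. one literal backslash
plain_pattern = "Var\\\\boundary"   # the same regex written without the r prefix

print(raw_pattern == plain_pattern)                  # True
print(re.search(raw_pattern, subject) is not None)   # True
```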

re.escape

This basically puts a backslash in front of any special character. Hence, if you expect a special character in TEXTO, you need to write:

if re.search(rf"\b(?=\w){re.escape(TEXTO)}\b(?!\w)", subject, re.IGNORECASE):
    print("match")
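
To see why escaping matters, here is a small sketch with a value containing a regex metacharacter. The value "a.b" and the test strings are assumptions chosen for illustration.

```python
import re

TEXTO = "a.b"  # "." is a regex metacharacter that matches any character

unescaped = rf"\b{TEXTO}\b"
escaped = rf"\b{re.escape(TEXTO)}\b"

print(re.escape(TEXTO))                          # a\.b
print(re.search(unescaped, "axb") is not None)   # True (unintended match)
print(re.search(escaped, "axb") is not None)     # False
print(re.search(escaped, "a.b") is not None)     # True
```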

NOTE: For Python >= 3.7, the characters !, ", %, ', ,, /, :, ;, <, =, >, @, and ` are not escaped. Only special characters that have a meaning in a regex are still escaped. _ has not been escaped since Python 3.3. (See here.)

Curly braces:

If you want to use quantifiers within a regular expression written as an f-string, you have to use double curly braces. Let's say you want to match TEXTO followed by exactly 2 digits:

if re.search(rf"\b(?=\w){re.escape(TEXTO)}\d{{2}}\b(?!\w)", subject, re.IGNORECASE):
    print("match")
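
Printing the resulting pattern makes the brace handling visible. This is a sketch with an assumed value "item" and assumed test strings.

```python
import re

TEXTO = "item"
# {{2}} in the f-string becomes the literal quantifier {2} in the regex.
pattern = rf"\b{re.escape(TEXTO)}\d{{2}}\b"

print(pattern)                                              # \bitem\d{2}\b
print(re.search(pattern, "see item42 here") is not None)    # True
print(re.search(pattern, "see item4 here") is not None)     # False
print(re.search(pattern, "see item421 here") is not None)   # False
```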
