What does the "r" in Python's re.compile(r' pattern flags') mean?
I am reading through http://docs.python.org/2/library/re.html. According to this, the "r" in Python's re.compile(r' pattern flags') refers to the raw string notation:
The solution is to use Python’s raw string notation for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with 'r'. So r"\n" is a two-character string containing '\' and 'n', while "\n" is a one-character string containing a newline. Usually patterns will be expressed in Python code using this raw string notation.
Would it be fair to say then that:
re.compile(r pattern) means that "pattern" is a regex, while re.compile(pattern) means that "pattern" is an exact match?
As @PauloBu stated, the r string prefix is not specifically related to regexes, but to strings generally in Python.
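A minimal sketch of that point follows. The pattern a.c and the sample text are made up for illustration, and re.escape is not mentioned in the original answer; it is shown only as one way to get a literal match. With no backslashes in the literal, the raw and normal forms are the same string, and re.compile treats both as a regular expression:

```python
import re

# Hypothetical pattern with no backslashes: the r prefix changes nothing.
plain = 'a.c'
raw = r'a.c'
same_string = (plain == raw)

# Both forms are compiled as regular expressions: '.' matches any character,
# so neither one is an "exact match" for the literal text 'a.c'.
sample = 'abc a.c axc'
plain_hits = re.compile(plain).findall(sample)
raw_hits = re.compile(raw).findall(sample)

# If a literal match is what you want, escape the pattern instead.
literal_hits = re.compile(re.escape(plain)).findall(sample)

print(same_string)   # True
print(plain_hits)    # ['abc', 'a.c', 'axc']
print(raw_hits)      # ['abc', 'a.c', 'axc']
print(literal_hits)  # ['a.c']
```

So the answer to the question is no: the prefix only affects how Python reads the string literal, not how re interprets the resulting pattern.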
Normal strings use the backslash character as an escape character for special characters (like newlines):
>>> print('this is \n a test')
this is
a test
The r prefix tells the interpreter not to do this:
>>> print(r'this is \n a test')
this is \n a test
>>>
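The difference can also be checked by looking at the strings themselves rather than their printed form. This is a small sketch with illustrative variable names:

```python
normal = '\n'    # one character: a newline
raw = r'\n'      # two characters: a backslash followed by 'n'
escaped = '\\n'  # the same two characters, written with an escaped backslash

print(len(normal))     # 1
print(len(raw))        # 2
print(raw == escaped)  # True
```

The raw string and the explicitly escaped string are the same object as far as any consumer (including re) is concerned; the prefix is just a more convenient way to write it.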
This matters in regular expressions because the backslash has to reach the re module intact. In particular, \b matches the empty string, but only at the start or end of a word. re expects the two-character string \b, but in a normal string literal '\b' is converted to the ASCII backspace character, so you need to either escape the backslash explicitly ('\\b') or tell Python it is a raw string (r'\b').
>>> import re
>>> re.findall('\b', 'test') # the backslash gets consumed by the python string interpreter
[]
>>> re.findall('\\b', 'test') # backslash is explicitly escaped and is passed through to re module
['', '']
>>> re.findall(r'\b', 'test') # often this syntax is easier
['', '']
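The same effect shows up with re.compile, which is what the question asked about. The following is a sketch using a made-up pattern and sample text, comparing a raw and a non-raw literal:

```python
import re

text = 'cat concat cat.'  # made-up sample text

word = re.compile(r'\bcat\b')   # re receives \b: a word boundary
broken = re.compile('\bcat\b')  # re receives backspace characters (\x08)

word_hits = word.findall(text)
broken_hits = broken.findall(text)

print(word_hits)               # ['cat', 'cat']
print(broken_hits)             # []
print(repr(broken.pattern))    # '\x08cat\x08'
```

The non-raw version still compiles without error, which is why this mistake is easy to miss: it simply searches for backspace characters that are not in the text.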