What does the "r" in Python's re.compile(r' pattern flags') mean?


Question

I am reading through http://docs.python.org/2/library/re.html. According to this, the "r" in Python's re.compile(r' pattern flags') refers to the raw string notation:

The solution is to use Python’s raw string notation for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with 'r'. So r"\n" is a two-character string containing '\' and 'n', while "\n" is a one-character string containing a newline. Usually patterns will be expressed in Python code using this raw string notation.

Would it be fair to say then that:

re.compile(r pattern) means that "pattern" is a regex, while re.compile(pattern) means that "pattern" is an exact match?

Answer

As @PauloBu stated, the r string prefix is not specific to regexes; it applies to string literals in Python generally.
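
To make the point about the question's guess concrete, here is a minimal sketch (the names pattern and literal are just illustrative). It assumes nothing beyond the standard re module: the argument to re.compile is compiled as a regex whether or not the literal was written with r, and an exact match is something you would ask for separately, for example with re.escape.

```python
import re

# With no backslashes involved, the raw and normal literals are the same string.
assert r'a.c' == 'a.c'

# Either way the argument is compiled as a regex: '.' matches any character.
pattern = re.compile('a.c')
print(bool(pattern.search('abc')))   # True
print(bool(pattern.search('a.c')))   # True

# For an exact (literal) match, escape the metacharacters with re.escape.
literal = re.compile(re.escape('a.c'))
print(bool(literal.search('abc')))   # False
print(bool(literal.search('a.c')))   # True
```

So the r prefix only changes how Python reads the source text of the literal, not how re interprets the resulting string.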

Normal strings use the backslash character as an escape character for special characters (like newlines):

>>> print('this is \n a test')
this is 
 a test

The r prefix tells the interpreter not to do this:

>>> print(r'this is \n a test')
this is \n a test
>>> 
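
If it helps, the difference can also be checked directly on the string values rather than on printed output. This is a small sketch with made-up variable names (normal, raw):

```python
normal = "\n"   # escape sequence processed: a single newline character
raw = r"\n"     # raw literal: a backslash followed by the letter n

print(len(normal))    # 1
print(len(raw))       # 2
print(raw == "\\n")   # True: the raw literal equals the explicitly escaped one
```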

This is important in regular expressions, because the backslash has to reach the re module intact. In particular, \b matches the empty string at the start or end of a word. re expects the two-character string \b, but in a normal string literal '\b' is converted to the ASCII backspace character, so you need to either escape the backslash explicitly ('\\b') or tell Python it is a raw string (r'\b').

>>> import re
>>> re.findall('\b', 'test') # the backslash gets consumed by the python string interpreter
[]
>>> re.findall('\\b', 'test') # backslash is explicitly escaped and is passed through to re module
['', '']
>>> re.findall(r'\b', 'test') # often this syntax is easier
['', '']
