What's the difference between re.DOTALL and re.MULTILINE?


Question

When matching an expression across multiple lines, I have always used re.DOTALL and it worked fine. Now I have stumbled across the re.MULTILINE flag, and it looks like it does the same thing.

From the re module (doesn't make it clearer, but the values are different):

M = MULTILINE = sre_compile.SRE_FLAG_MULTILINE # make anchors look for newline
S = DOTALL = sre_compile.SRE_FLAG_DOTALL # make dot match newline

SRE_FLAG_MULTILINE = 8 # treat target as multiline string
SRE_FLAG_DOTALL = 16 # treat target as a single string

So is there a difference in usage, and what are the subtle cases where the two could return something different?

Solution

They are quite different. Yes, both affect how newlines are treated, but they switch behaviour for different concepts.

  • re.MULTILINE affects where ^ and $ anchors match.

    Without the switch, ^ matches only at the start of the whole text and $ only at the end (or just before a single trailing newline at the very end). With the switch, ^ also matches immediately after each newline and $ immediately before each newline:

    >>> import re
    >>> re.search('foo$', 'foo\nbar') is None  # no match
    True
    >>> re.search('foo$', 'foo\nbar', flags=re.MULTILINE)
    <_sre.SRE_Match object; span=(0, 3), match='foo'>
    

  • re.DOTALL affects what the . pattern can match.

    Without the switch, . matches any character except a newline. With the switch, newlines are matched as well:

    >>> re.search('foo.', 'foo\nbar') is None  # no match
    True
    >>> re.search('foo.', 'foo\nbar', flags=re.DOTALL)
    <_sre.SRE_Match object; span=(0, 4), match='foo\n'>
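
To see the two flags side by side, here is a minimal sketch that runs the same kind of pattern under each flag and under both combined. The sample text and the variable names are made up for illustration; only re.findall, re.MULTILINE and re.DOTALL come from the standard library.

```python
import re

# Hypothetical three-line sample text, chosen only for illustration.
text = "foo\nbar\nbaz"

# No flags: ^ and $ anchor to the whole text, and \w+ cannot cross a newline.
no_flag = re.findall(r"^\w+$", text)

# MULTILINE: ^ and $ also match at each line boundary, so every line matches.
multiline = re.findall(r"^\w+$", text, flags=re.MULTILINE)

# DOTALL: . may match newlines, so .+ spans the whole text in one match.
dotall = re.findall(r"^.+$", text, flags=re.DOTALL)

# MULTILINE alone with a dot: . still stops at newlines, one match per line.
multiline_dot = re.findall(r"^.+$", text, flags=re.MULTILINE)

# Both flags combined with |: the greedy .+ again swallows the newlines.
both = re.findall(r"^.+$", text, flags=re.MULTILINE | re.DOTALL)

print(no_flag)
print(multiline)
print(dotall)
print(multiline_dot)
print(both)
```

The flags are independent, so they can be combined with |. In the combined case the greedy .+ runs across the newlines even though ^ and $ could have matched at each line boundary, which is one of the subtle cases where the choice of flags changes the result.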
    
