Python regular expression pattern * is not working as expected

Views: 44 | Posted: 2021/6/25 19:55:38 | Tags: python, regex

Question

While working through Google's 2010 Python class, I found the following documentation:

'*' -- 0 or more occurrences of the pattern to its left

But when I tried the following

re.search(r'i*','biiiiiiiiiiiiiig').group()

I expected 'iiiiiiiiiiiiii' as output but got ''. Why?

Solution

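* means zero or more occurrences, and re.search returns only the first match it finds while scanning from the left. At index 0 the character is b, so i* matches zero i's there, and that first match is an empty string. That is why you get an empty string as output.

Change * to + to get the desired output: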
>>> re.search(r'i*','biiiiiiiiiiiiiig').group()
''
>>> re.search(r'i+','biiiiiiiiiiiiiig').group()
'iiiiiiiiiiiiii'
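To see where each match lands, here is a minimal sketch that prints the span of both matches. The shorter input string built with "i" * 5 and the variable names are assumptions made for readability, not part of the original question.

```python
import re

text = "b" + "i" * 5 + "g"   # hypothetical shorter input: 'biiiiig'

star = re.search(r"i*", text)
plus = re.search(r"i+", text)

# i* succeeds immediately at index 0 by matching zero characters.
print(star.span(), repr(star.group()))   # (0, 0) ''

# i+ needs at least one i, so its first possible match starts at index 1.
print(plus.span(), repr(plus.group()))   # (1, 6) 'iiiii'
```

The span (0, 0) shows that the empty match sits before the leading b, not inside the run of i's.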
Consider this example.
>>> re.search(r'i*','biiiiiiiiiiiiiig').group()
''
>>> re.search(r'i*','iiiiiiiiiiiiiig').group()
'iiiiiiiiiiiiii'
In the second call, i* returns the whole run of i's because the regex engine first tries to match zero or more i's at the very start of the string. When the string starts with i, the pattern greedily consumes every consecutive i, so you get the full run as output.

When the string does not start with i (as in the first call, where it starts with b), i* still succeeds at that position, but only by matching zero characters, that is, the empty string just before b. In general, i* matches an empty string at every position where it cannot consume an i; here those are the positions before b, before g, and at the end of the string. Because re.search returns only the first match, you get the empty string found before b.
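The effect of the starting position can be checked with a small sketch. The short input strings are hypothetical stand-ins for the ones above, and the second argument to the compiled pattern's search method is the standard pos parameter, used here only to illustrate where the scan begins.

```python
import re

pattern = re.compile(r"i*")

for text in ("iiig", "biiig"):   # hypothetical short inputs
    m = pattern.search(text)
    print(repr(text), m.span(), repr(m.group()))
# 'iiig' (0, 3) 'iii'
# 'biiig' (0, 0) ''

# Starting the scan at index 1 skips the leading b, so the greedy run is found.
m = pattern.search("biiig", 1)
print(m.span(), repr(m.group()))   # (1, 4) 'iii'
```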

Why do I get three empty strings as output in the example below?
>>> re.findall(r'i*','biiiiiiiiiiiiiig')
['', 'iiiiiiiiiiiiii', '', '']
As explained earlier, i* yields an empty match at every position where it cannot consume an i. The regex engine scans the input from left to right:

The first empty string appears because i* cannot match the character b, but it does match the empty string just before b.
The engine then moves to the next character, i, which the pattern matches, and it greedily consumes all the following i's. That run of i's is the second item in the list.
After the i's, the engine reaches g, which i* cannot match, so i* matches the empty string just before g. That is the third item, and the second empty string.
Finally, i* matches the empty string at the end of the input, after g. That is the fourth item, and the third empty string.
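The walk-through above can be checked with re.finditer, which reports the position of every match. This is a minimal sketch on a shorter, hypothetical input with the same shape (b, a run of i's, then g); the variable names are illustrative.

```python
import re

text = "biiig"   # hypothetical shorter input with the same shape

matches = [(m.start(), m.end(), m.group()) for m in re.finditer(r"i*", text)]
for start, end, value in matches:
    print(start, end, repr(value))
# 0 0 ''
# 1 4 'iii'
# 4 4 ''
# 5 5 ''

# Dropping the empty matches (or using i+) leaves only the run of i's.
non_empty = [value for _, _, value in matches if value]
print(non_empty)                  # ['iii']
print(re.findall(r"i+", text))    # ['iii']
```

If the empty strings are unwanted, i+ is usually the simpler choice, since it never produces them in the first place.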


                        