Word boundary with regex - cannot extract all words

Published: 2021/7/6 19:50:46. Tags: python, regex, string, findall, boundary


Question

I need to extract Male-Cat twice:

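import re

a = "Male-Cat Male-Cat Male-Cat-Female"

# First attempt: whitespace or string start/end on both sides
b = re.findall(r'(?:\s|^)Male-Cat(?:\s|$)', a)
print(b)
# ['Male-Cat ']

# Second attempt: \b word boundaries
c = re.findall(r'\bMale-Cat\b', a)
print(c)
# ['Male-Cat', 'Male-Cat', 'Male-Cat']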

I need to extract Male-Cat three times:

a = "Male-Cat Male-Cat Male-Cat"

# First attempt: the middle occurrence is missed
b = re.findall(r'(?:\s|^)Male-Cat(?:\s|$)', a)
print(b)
# ['Male-Cat ', ' Male-Cat']

# Second attempt: \b word boundaries
c = re.findall(r'\bMale-Cat\b', a)
print(c)
# ['Male-Cat', 'Male-Cat', 'Male-Cat']

Other strings that are parsed correctly by the first approach:

a = 'Male-Cat Female-Cat Male-Cat-Female Male-Cat'
a = 'Male-Cat-Female'
a = 'Male-Cat'

Is something missing? Can you explain what is wrong and what the correct way is?

Recommended answer

Use lookarounds to extract words inside whitespace boundaries:

r'(?<!\S)Male-Cat(?!\S)'
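As a quick check, the pattern can be applied to the two strings from the question. This is a minimal sketch; the variable names `pattern`, `two` and `three` are chosen here for illustration only.

```python
import re

# Pattern from the answer: Male-Cat with no non-whitespace character
# directly before or after it.
pattern = re.compile(r'(?<!\S)Male-Cat(?!\S)')

# The trailing Male-Cat-Female is rejected because '-' follows Male-Cat.
two = pattern.findall("Male-Cat Male-Cat Male-Cat-Female")

# All three occurrences are found, including the middle one.
three = pattern.findall("Male-Cat Male-Cat Male-Cat")

print(two)    # ['Male-Cat', 'Male-Cat']
print(three)  # ['Male-Cat', 'Male-Cat', 'Male-Cat']
```

Unlike the first attempt in the question, the returned matches contain no leading or trailing spaces.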


Details

(?<!\S) - a negative lookbehind: no non-whitespace character may appear immediately to the left, so the current position must follow whitespace or be the start of the string
Male-Cat - the term to search for
(?!\S) - a negative lookahead: no non-whitespace character may appear immediately to the right, so the current position must precede whitespace or be the end of the string
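The same three parts can be assembled for any search term. The helper below is a hypothetical illustration, not part of the original answer: the name `find_whole_tokens` is made up here, and `re.escape` is added on the assumption that the term should be matched literally even if it contains regex metacharacters.

```python
import re

def find_whole_tokens(term, text):
    """Return every occurrence of `term` delimited by whitespace or string edges.

    Hypothetical helper: wraps the literal term in the two lookarounds
    described above.
    """
    # re.escape makes characters such as '+' or '.' match literally.
    pattern = r'(?<!\S)' + re.escape(term) + r'(?!\S)'
    return re.findall(pattern, text)

cats = find_whole_tokens("Male-Cat", "Male-Cat Male-Cat Male-Cat-Female")
plus = find_whole_tokens("C++", "C++ C++x C++")

print(cats)  # ['Male-Cat', 'Male-Cat']
print(plus)  # ['C++', 'C++']
```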

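Since (?<!\S) and (?!\S) are zero-width assertions, they do not consume the whitespace, so consecutive matches are found. In the first pattern from the question, (?:\s|^) and (?:\s|$) consume the whitespace they match, so the space that ends one match is no longer available to start the next one. The \b version fails for a different reason: the hyphen is a non-word character, so \b also matches between Cat and -, and Male-Cat-Female is wrongly accepted.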
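The difference is easiest to see in the match spans. The sketch below compares the consuming pattern from the question with the lookaround pattern on the same string; the names `consuming`, `zero_width` and the two span lists are illustrative.

```python
import re

text = "Male-Cat Male-Cat Male-Cat"

# Pattern from the question: the groups consume the surrounding whitespace.
consuming = re.compile(r'(?:\s|^)Male-Cat(?:\s|$)')
# Pattern from the answer: the lookarounds only test the neighbors.
zero_width = re.compile(r'(?<!\S)Male-Cat(?!\S)')

consuming_spans = [m.span() for m in consuming.finditer(text)]
zero_width_spans = [m.span() for m in zero_width.finditer(text)]

# The first consuming match covers index 8 (the space), so the search
# resumes at index 9 with no whitespace left in front of the second word.
print(consuming_spans)   # [(0, 9), (17, 26)]
print(zero_width_spans)  # [(0, 8), (9, 17), (18, 26)]
```

The consuming spans overlap the separators, while the lookaround spans cover only the words themselves, which is why no occurrence is skipped.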
