Python re.search
I have a string variable:
string = "123hello456world789"
The string contains no spaces. I want to write a regex that prints only the words made up of letters (a-z). I tried a simple regex:
import re
pat = "([a-z]+){1,}"
match = re.search(pat, string, re.DEBUG)
The match object contains only the word hello; the word world is not matched.
When I used re.findall(), I got both hello and world.
My question is: why can't we do this with re.search()?
How can this be done with re.search()?
re.search() finds the pattern only once in the string. From the documentation:
Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
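The difference between returning None and finding a zero-length match, mentioned in the quoted passage, can be seen in a small sketch. The digit-only input and the two patterns below are chosen purely for illustration and do not come from the question.

```python
import re

digits_only = "123"  # illustrative input with no letters (an assumption, not from the question)

# [a-z]+ needs at least one letter, so no position matches and search() returns None.
no_match = re.search(r"[a-z]+", digits_only)

# [a-z]* may match zero letters, so search() succeeds with an empty match at index 0.
empty_match = re.search(r"[a-z]*", digits_only)

print(no_match)                   # None
print(repr(empty_match.group()))  # ''
print(empty_match.span())         # (0, 0)
```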
To match every occurrence, you need re.findall(). From the documentation:
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
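How the number of groups changes what findall() returns, as the quoted passage describes, might be illustrated like this. The patterns with digits are made up for the illustration; only the input string comes from the question.

```python
import re

text = "123hello456world789"

# No group: findall() returns the full matches.
no_group = re.findall(r"[a-z]+", text)

# One group: findall() returns only what the group captured.
one_group = re.findall(r"\d+([a-z]+)", text)

# Two groups: findall() returns one tuple per match.
two_groups = re.findall(r"(\d+)([a-z]+)", text)

print(no_group)    # ['hello', 'world']
print(one_group)   # ['hello', 'world']
print(two_groups)  # [('123', 'hello'), ('456', 'world')]
```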
Example:
>>> import re
>>> regex = re.compile(r'([a-z]+)', re.I)
>>> # using search we only get the first item.
>>> regex.search("123hello456world789").groups()
('hello',)
>>> # using findall we get every item.
>>> regex.findall("123hello456world789")
['hello', 'world']
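If you really want to stay with search(), one possible approach is to call it repeatedly, each time starting after the end of the previous match. The sketch below relies on the optional pos argument of a compiled pattern's search() method; search_all is a hypothetical helper name introduced only for this illustration. In practice re.finditer() gives the same match objects with less code.

```python
import re

def search_all(pattern, text):
    """Collect every non-overlapping match by calling search() repeatedly.

    search_all is a hypothetical helper written for this illustration.
    """
    matches = []
    pos = 0
    while pos <= len(text):
        m = pattern.search(text, pos)  # start scanning at index pos
        if m is None:
            break
        matches.append(m.group())
        # Continue after this match; step one extra character on an
        # empty match so the loop always makes progress.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return matches

regex = re.compile(r"[a-z]+")
found = search_all(regex, "123hello456world789")
print(found)  # ['hello', 'world']

# finditer() yields the same matches without the manual loop.
print([m.group() for m in regex.finditer("123hello456world789")])  # ['hello', 'world']
```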
UPDATE:
Due to your duplicate question (as discussed at this link) I have added my other answer here as well:
>>> import re
>>> regex = re.compile(r'([a-z][a-z-\']+[a-z])')
>>> regex.findall("HELLO W-O-R-L-D") # this has uppercase
[] # there are no results here, because the string is uppercase
>>> regex.findall("HELLO W-O-R-L-D".lower()) # let's lowercase it
['hello', 'w-o-r-l-d'] # now we have results
>>> regex.findall("123hello456world789")
['hello', 'world']
As you can see, the first sample you provided failed because of the uppercase letters. You can simply add the re.IGNORECASE flag, though you mentioned that matches should be lowercase only.
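As a rough comparison of the two options, the sketch below runs the same pattern with and without re.IGNORECASE. Note that the flag keeps the original casing in the results, whereas calling .lower() first returns lowercase text. The variable names are illustrative.

```python
import re

text = "HELLO W-O-R-L-D"
pattern = r"([a-z][a-z-\']+[a-z])"  # same pattern as in the session above

# Without the flag, [a-z] only matches lowercase letters, so nothing is found.
lowercase_only = re.findall(pattern, text)

# With re.IGNORECASE, the uppercase words match and keep their original case.
ignoring_case = re.findall(pattern, text, re.IGNORECASE)

print(lowercase_only)  # []
print(ignoring_case)   # ['HELLO', 'W-O-R-L-D']
```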