Python re.search


Problem description


I have the following string variable:

string = "123hello456world789"

The string contains no spaces. I want to write a regex that prints only the words made up of the letters a-z. I tried a simple regex:

import re
pat = r"([a-z]+){1,}"
match = re.search(pat, string, re.DEBUG)

The match object contains only the word hello; the word world is not matched.

When I use re.findall(), I get both hello and world.

My question is: why can't we do this with re.search()?

How can I do this with re.search()?

Solution

re.search() finds the pattern only once in the string. From the documentation:

Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

To match every occurrence, you need re.findall(). From the documentation:

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
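The rule about groups in the quoted documentation is easy to overlook, so here is a small sketch of it using the question's string. The second and third patterns are made up purely for illustration and do not come from the question or the answer.

```python
import re

text = "123hello456world789"

# No groups: findall returns the full text of each match.
no_group = re.findall(r"[a-z]+", text)

# One group: findall returns what the group captured, not the whole match
# (the trailing digit is matched but left out of the result).
one_group = re.findall(r"([a-z]+)\d", text)

# Two groups: findall returns a list of tuples, one tuple per match.
two_groups = re.findall(r"([a-z]+)(\d+)", text)

print(no_group)    # ['hello', 'world']
print(one_group)   # ['hello', 'world']
print(two_groups)  # [('hello', '456'), ('world', '789')]
```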

Example:

>>> import re
>>> regex = re.compile(r'([a-z]+)', re.I)
>>> # using search we only get the first item.
>>> regex.search("123hello456world789").groups()
('hello',)
>>> # using findall we get every item.
>>> regex.findall("123hello456world789")
['hello', 'world']
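If you really want to stay with search(), one possible approach is to call it in a loop and restart after the end of the previous match. The sketch below relies on the optional pos argument that compiled patterns accept in search(); the helper name search_all is hypothetical and only used for this illustration. In practice, re.finditer() gives the same matches with less code.

```python
import re

regex = re.compile(r"[a-z]+")
text = "123hello456world789"


def search_all(pattern, string):
    """Collect every match by calling pattern.search() repeatedly.

    Hypothetical helper for illustration; not part of the re module.
    """
    results = []
    pos = 0
    while pos <= len(string):
        match = pattern.search(string, pos)
        if match is None:
            break
        results.append(match.group())
        # Resume after this match; step one extra character on an
        # empty match so the loop cannot get stuck.
        pos = match.end() if match.end() > match.start() else match.end() + 1
    return results


words = search_all(regex, text)
print(words)  # ['hello', 'world']

# re.finditer() yields one match object per occurrence.
iter_words = [m.group() for m in regex.finditer(text)]
print(iter_words)  # ['hello', 'world']
```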


UPDATE:

Because of your duplicate question (as discussed at this link), I have added my other answer here as well:

>>> import re
>>> regex = re.compile(r'([a-z][a-z-\']+[a-z])')
>>> regex.findall("HELLO W-O-R-L-D") # this has uppercase
[]  # there are no results here, because the string is uppercase
>>> regex.findall("HELLO W-O-R-L-D".lower()) # lets lowercase
['hello', 'w-o-r-l-d'] # now we have results
>>> regex.findall("123hello456world789")
['hello', 'world']

As you can see, the first sample you provided failed because of the uppercase letters. You can simply add the re.IGNORECASE flag, although you mentioned that matches should be lowercase only.
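As a rough sketch of the re.IGNORECASE option, the snippet below compiles the same character-class pattern with and without the flag. It drops the capturing group and escapes the hyphen for readability, which is a stylistic assumption and not part of the original answer.

```python
import re

# Same idea as the pattern above: a letter, then letters, hyphens or
# apostrophes, then a final letter.
pattern = r"[a-z][a-z\-']+[a-z]"
case_sensitive = re.compile(pattern)
case_insensitive = re.compile(pattern, re.IGNORECASE)

text = "HELLO W-O-R-L-D"

sensitive_result = case_sensitive.findall(text)
insensitive_result = case_insensitive.findall(text)

print(sensitive_result)    # []
print(insensitive_result)  # ['HELLO', 'W-O-R-L-D']
```

Note that the flag keeps the original casing in the results, whereas calling .lower() on the input first returns lowercase matches.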
