How do you use a regex in a list comprehension in Python?

Question

I'm trying to locate all index positions of a string in a list of words, and I want the values returned as a list. I would like to find the string if it is on its own, or if it is preceded or followed by punctuation, but not if it is a substring of a larger word.

The following code captures only "cow" and misses both "test;cow" and "cow."

myList = ['test;cow', 'one', 'two', 'three', 'cow.', 'cow', 'acow']
myString = 'cow'
indices = [i for i, x in enumerate(myList) if x == myString]
print(indices)
>> [5]

I have tried changing the code to use a regular expression:

import re
myList = ['test;cow', 'one', 'two', 'three', 'cow.', 'cow', 'acow']
myString = 'cow'
indices = [i for i, x in enumerate(myList) if x == re.match('\W*myString\W*', myList)]
print(indices)

But this gives an error: expected string or buffer

If anyone knows what I'm doing wrong, I'd be very happy to hear. I have a feeling it has to do with my using a regular expression where a string is expected. Is there a solution?

The output I'm looking for should read:

>> [0, 4, 5]

Thanks

Recommended answer

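You don't need to compare the result of match with x; the match result itself can serve as the condition. And the match should be applied to x (each element) rather than to the whole list. Passing the list to re.match is what raises the "expected string or buffer" error.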

Also, you need to use re.search instead of re.match, because the regex pattern '\W*myString\W*' will not match the first element: re.match only matches at the start of the string, and the leading test consists of word characters that \W* cannot consume. Note too that myString inside the quotes is literal text, not the value of the variable, so the pattern has to be built from the variable. In fact, you only need to test the characters immediately before and after the string, not the complete element.
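To make the difference concrete, here is a small sketch that runs the question's list through both functions. The names words, target and loose are illustrative and not from the original post, and the pattern is built from the variable with re.escape rather than the literal text myString.

```python
import re

words = ['test;cow', 'one', 'two', 'three', 'cow.', 'cow', 'acow']
target = 'cow'

# Build the pattern from the variable; writing 'myString' inside the
# quotes would look for the literal text "myString".
loose = r'\W*' + re.escape(target) + r'\W*'

# re.match only tries position 0, so 'test;cow' is rejected.
match_hits = [i for i, w in enumerate(words) if re.match(loose, w)]

# re.search scans the whole string, but \W* may match nothing,
# so 'acow' is accepted as well.
search_hits = [i for i, w in enumerate(words) if re.search(loose, w)]

print(match_hits)   # [4, 5]
print(search_hits)  # [0, 4, 5, 6]
```

Neither result is the desired [0, 4, 5], which suggests that \W* is the wrong tool for this check whichever function is used.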

So you can use word boundaries around the string instead:

pattern = r'\b' + re.escape(myString) + r'\b'
indices = [i for i, x in enumerate(myList) if re.search(pattern, x)]
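For completeness, a self-contained version of the same idea might look like the sketch below. The helper name find_word_indices is hypothetical, and compiling the pattern once is an optional convenience rather than part of the answer above.

```python
import re

def find_word_indices(words, target):
    """Return the indices of items that contain target as a whole word."""
    # re.escape guards against regex metacharacters in target;
    # \b requires a word boundary on each side of it.
    pattern = re.compile(r'\b' + re.escape(target) + r'\b')
    return [i for i, w in enumerate(words) if pattern.search(w)]

myList = ['test;cow', 'one', 'two', 'three', 'cow.', 'cow', 'acow']
print(find_word_indices(myList, 'cow'))  # [0, 4, 5]
```

One caveat: \b treats digits and the underscore as word characters, so an item such as 'cow_' would not be matched by this pattern.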
