Python regex list using list
OK, so I have a list of strings that I would like to use as regex search patterns, e.g.:
import re
regex_strings = ['test1','test2','test3']
#Obviously this won't work here as is!
regex = re.compile(regex_strings)
I also have another list of strings, e.g.:
strgs = ['This is a test1','This is a test2','This is a test1','This is a test1','This is a test3']
I want to iterate over the 'strgs' list and regex check each string against the 'regex_strings' list. Then, if there's a match, return the entire string.
I've been scratching my head over this for a bit and I'm not quite sure of the best way to approach it. Any suggestions would be really appreciated!
Regards.
You can use the | (alternation) operator in a regular expression, like this:
re.compile("(" + "|".join(regex_strings) + ")")
So the regular expression becomes (test1|test2|test3), which matches any one of the three alternatives. You can check the meaning of this regular expression here: http://regex101.com/r/pR5pU1
Sample run:
import re
regex_strings = ['test1','test2','test3']
regex = re.compile("(" + "|".join(regex_strings) + ")")
strgs = ['This is a test1','This is a test2','This is a test1','This is a test1','This is a test3']
# Keep every string in which the combined pattern matches somewhere
print([strg for strg in strgs if regex.search(strg)])
Output
['This is a test1', 'This is a test2', 'This is a test1', 'This is a test1', 'This is a test3']
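Every string in the sample input happens to match, so the output equals the input. As a minimal sketch of the same idea with a non-matching entry added, the filter can be wrapped in a small helper. The function name filter_matching and the 'No match here' string are made up for illustration and are not part of the original answer; a non-capturing group (?:...) is used here since the group contents are not needed.

```python
import re

def filter_matching(patterns, strings):
    """Return the strings in which at least one of the patterns matches.

    `patterns` are treated as regular expressions, as in the question.
    (Hypothetical helper, not from the original answer.)
    """
    # Join the patterns with | so a single compiled regex tries them all
    combined = re.compile("(?:" + "|".join(patterns) + ")")
    return [s for s in strings if combined.search(s)]

regex_strings = ['test1', 'test2', 'test3']
strgs = ['This is a test1', 'No match here', 'This is a test3']

kept = filter_matching(regex_strings, strgs)
print(kept)  # ['This is a test1', 'This is a test3']
```

Compiling one combined pattern means each string is scanned once, rather than once per entry in regex_strings.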
Edit: If you want to return only the matched part,
import re
regex_strings = ['test1','test2','test3']
regex = re.compile("(" + "|".join(regex_strings) + ")")
strgs = ['This is a test1','This is a test2','This is a test1','This is a test1','This is a test3']
result = []
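for strg in strgs:
    # search() returns a match object, or None if nothing matches
    temp = regex.search(strg)
    if temp:
        # group() with no arguments is the text that actually matched
        result.append(temp.group())
print(result)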
Output
['test1', 'test2', 'test1', 'test1', 'test3']
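One caveat worth sketching: joining with | treats each list entry as a regular expression. If the entries are meant as literal text and might contain metacharacters such as a dot, re.escape can be applied to each one first. The helper name build_literal_regex and the sample strings below are assumptions for illustration, not part of the original answer.

```python
import re

def build_literal_regex(words):
    """Compile a regex that matches any of `words` as literal text.

    (Hypothetical helper; assumes the entries are plain strings,
    not regular expressions.)
    """
    # re.escape backslash-escapes regex metacharacters such as '.'
    return re.compile("|".join(re.escape(w) for w in words))

literal_strings = ['a.b', 'test1']
literal_regex = build_literal_regex(literal_strings)

candidates = ['value a.b here', 'value axb here', 'This is a test1']
matched = [s for s in candidates if literal_regex.search(s)]
print(matched)  # ['value a.b here', 'This is a test1']
```

Without the escaping step, the pattern a.b would also match 'axb', because an unescaped dot matches any character.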