Python regex list using list


Question

I have a list of strings that I would like to use as regex search patterns. For example:

import re
regex_strings = ['test1','test2','test3']

#Obviously this won't work here as is!  
regex = re.compile(regex_strings)

I also have another list of strings. For example:

strgs = ['This is a test1','This is a test2','This is a test1','This is a test1','This is a test3']

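I want to iterate over the 'strgs' list and check each string against the patterns in the 'regex_strings' list. If there is a match, I want to return the entire string.

I've been scratching my head over this for a while, and I'm not sure what the best approach is. Any suggestions would be really appreciated!

Regards.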

Solution

You can use the | (alternation) operator in the regular expression, like this:

re.compile("(" + "|".join(regex_strings) + ")")

The regular expression then becomes (test1|test2|test3). You can inspect what this regular expression means at http://regex101.com/r/pR5pU1
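Joining with | treats every list entry as a regex fragment. If the entries are meant to be matched as literal text (an assumption; the question does not say), metacharacters such as . or + would change the meaning. A minimal sketch of one way to guard against that with re.escape, using hypothetical terms and inputs that are not from the original post:

```python
import re

# Hypothetical search terms; "a.b" and "c+" contain regex metacharacters.
literal_terms = ["a.b", "c+", "test1"]

# Escape each term so it is matched literally, then join with "|".
escaped_pattern = re.compile("|".join(re.escape(term) for term in literal_terms))

# Hypothetical inputs: "aXb" and "ccc" would match if the terms were not escaped.
candidates = ["value a.b here", "value aXb here", "c+ compiler", "ccc", "This is a test1"]
literal_matches = [s for s in candidates if escaped_pattern.search(s)]
print(literal_matches)
```

If the list entries are intended to be real regular expressions, skip the escaping and join them as shown above.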

Sample run:

import re
regex_strings = ['test1','test2','test3']
regex = re.compile("(" + "|".join(regex_strings) + ")")
strgs = ['This is a test1','This is a test2','This is a test1','This is a test1','This is a test3']
print([strg for strg in strgs if regex.search(strg)])

Output

['This is a test1', 'This is a test2', 'This is a test1', 'This is a test1', 'This is a test3']
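Note that search() finds the pattern anywhere in the string, so test1 would also match inside a longer token such as test10. If whole-word matches are wanted (an assumption beyond the original question), a sketch using \b word boundaries and hypothetical sample strings could look like this:

```python
import re

regex_strings = ['test1', 'test2', 'test3']

# \b anchors the alternation at word boundaries; (?:...) groups without capturing.
bounded = re.compile(r"\b(?:" + "|".join(regex_strings) + r")\b")

# Hypothetical inputs chosen to show the difference from a plain search.
samples = ['This is a test1', 'This is a test10', 'contest1']
whole_word = [s for s in samples if bounded.search(s)]
print(whole_word)
```

Only the first sample is kept, because in the other two test1 is embedded in a longer word.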

Edit: If you want to return only the matched part,

import re
regex_strings = ['test1','test2','test3']
regex = re.compile("(" + "|".join(regex_strings) + ")")
strgs = ['This is a test1','This is a test2','This is a test1','This is a test1','This is a test3']
result = []
for strg in strgs:
    temp = regex.search(strg)
    if temp:
        result.append(temp.group())
print(result)

Output

['test1', 'test2', 'test1', 'test1', 'test3']
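The loop above records only the first match in each string, since search() stops at the first hit. If a string may contain several of the patterns and all of them are wanted, findall() is one option. The following is a sketch with hypothetical input strings, not part of the original answer:

```python
import re

regex_strings = ['test1', 'test2', 'test3']

# Non-capturing group, so findall() returns the full matched text.
regex = re.compile("(?:" + "|".join(regex_strings) + ")")

# Hypothetical inputs: one with two matches, one with none, one with a single match.
multi = ['test1 and test3', 'nothing here', 'test2']
all_matches = {s: regex.findall(s) for s in multi}
print(all_matches)
```

Strings without any match map to an empty list, which makes it easy to filter them out afterwards.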

