Python Regex sub() with multiple patterns

Problem description

I'm wondering if there's a way to combine patterns in a single re.sub() call instead of chaining multiple calls like below:

import re
s1 = "Please check with the store to confirm holiday hours."
s2 = ''' Hours:
            Monday: 9:30am - 6:00pm
Tuesday: 9:30am - 6:00pm
Wednesday: 9:30am - 6:00pm
Thursday: 9:30am - 6:00pm
Friday: 9:30am - 9:00pm
Saturday: 9:30am - 6:00pm
Sunday: 11:00am - 6:00pm

Please check with the store to confirm holiday hours.'''

strip1 = re.sub(s1, '', s2)
strip2 = re.sub('\t', '', strip1)
print(strip2)

Desired output:

Hours:
Monday: 9:30am - 6:00pm
Tuesday: 9:30am - 6:00pm
Wednesday: 9:30am - 6:00pm
Thursday: 9:30am - 6:00pm
Friday: 9:30am - 9:00pm
Saturday: 9:30am - 6:00pm
Sunday: 11:00am - 6:00pm

Recommended answer

If you're just trying to delete specific substrings, you can combine the patterns with alternation (|) and remove them all in a single pass:

import re

# s2 is the multi-line string defined in the question
pat1 = r"Please check with the store to confirm holiday hours."
pat2 = r'\t'
combined_pat = r'|'.join((pat1, pat2))  # matches either pattern
stripped = re.sub(combined_pat, '', s2)

It's more complicated if the "patterns" use actual regex special characters (because then you need to wrap them so that the alternation breaks at the right places), but for simple fixed patterns, this is all you need.
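For fixed strings that do happen to contain special characters, one option (not part of the original answer) is to run each one through re.escape before joining, so every string matches itself literally. A minimal sketch, where text and literals are made-up example values:

```python
import re

# Hypothetical input and fixed substrings to delete (illustration only).
text = "Total: $5.00 (approx.)\tsee note"
literals = ["$5.00", "(approx.)", "\t"]

# re.escape backslash-escapes regex metacharacters such as $ . ( )
# so each literal is matched exactly as written.
combined_pat = "|".join(map(re.escape, literals))

stripped = re.sub(combined_pat, "", text)
print(repr(stripped))
```

This only applies when every entry is meant as a literal string; escaping a real regex would disable its special characters.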

If you had real regexes, rather than fixed patterns, you might do something like:

all_pats = [...]  # your regex pattern strings
# Wrap each pattern in a non-capturing group before joining with |
combined_pat = r'|'.join(map(r'(?:{})'.format, all_pats))

so any regex special characters remain grouped without possibly "bleeding" across an alternation.
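As a rough illustration of the grouped form, here is a sketch with two made-up regexes applied to a shortened version of the question's text: one strips leading spaces or tabs on each line, the other drops the trailing notice. The patterns in all_pats, the sample text, and the use of re.MULTILINE are assumptions for this example, not part of the original answer.

```python
import re

# Shortened, hypothetical version of the string from the question.
text = (
    " Hours:\n"
    "\tMonday: 9:30am - 6:00pm\n"
    "Tuesday: 9:30am - 6:00pm\n"
    "\n"
    "Please check with the store to confirm holiday hours."
)

# Hypothetical patterns: leading indentation, and the notice
# together with any whitespace that precedes it.
all_pats = [r"^[ \t]+", r"\s*Please check with the store.*"]

# Each pattern is wrapped in a non-capturing group, then joined with |.
combined_pat = "|".join(map(r"(?:{})".format, all_pats))

# re.MULTILINE lets ^ match at the start of every line.
stripped = re.sub(combined_pat, "", text, flags=re.MULTILINE)
print(stripped)
```

Non-capturing groups are used so the wrapping does not change the numbering of any capture groups inside the individual patterns.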
