Replace a string using dictionary - regex


Problem description

I have a dictionary of slang terms with their meanings, and I want to replace every slang term in my text.

I have found a partially working solution: https://stackoverflow.com/a/2400577

Currently, my code looks like this:

import re

myText = 'brb some sample text I lov u. I need some $$ for 2mw.'

dictionary = {
  'brb': 'be right back',
  'lov u': 'love you',
  '$$': 'money',
  '2mw': 'tomorrow'
}

pattern = re.compile(r'\b(' + '|'.join(re.escape(key) for key in dictionary.keys()) + r')\b')
result = pattern.sub(lambda x: dictionary[x.group()], myText)

print(result)

Output:

be right back some sample text I love you. I need some $$ for tomorrow.

As you can see, the $$ signs have not been replaced, and I know this is due to the \b syntax. How can I change my regex to achieve my goal?

Recommended answer

Replace the word boundaries with lookarounds that check for word chars on either side of the search phrase:

pattern = re.compile(r'(?<!\w)(' + '|'.join(re.escape(key) for key in dictionary.keys()) + r')(?!\w)')

See the Python demo.

The (?<!\w) negative lookbehind fails the match if there is a word char immediately to the left of the current location, and the (?!\w) negative lookahead fails the match if there is a word char immediately to the right of the current location. Unlike \b, neither lookaround requires a word char inside the match itself, so a key made only of symbols, such as $$, can still match when it is surrounded by spaces.
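For reference, here is a minimal end-to-end sketch that plugs the lookaround pattern into the code from the question. The variable names my_text and alternation are illustrative and do not come from the original post.

```python
import re

my_text = 'brb some sample text I lov u. I need some $$ for 2mw.'

dictionary = {
    'brb': 'be right back',
    'lov u': 'love you',
    '$$': 'money',
    '2mw': 'tomorrow',
}

# Escape every key so that characters such as "$" are matched literally.
alternation = '|'.join(re.escape(key) for key in dictionary)

# (?<!\w) and (?!\w) only require that no word char touches the match,
# so a key made entirely of symbols ("$$") can match as well.
pattern = re.compile(r'(?<!\w)(' + alternation + r')(?!\w)')

result = pattern.sub(lambda m: dictionary[m.group()], my_text)
print(result)
# be right back some sample text I love you. I need some money for tomorrow.
```

If one key can be a prefix of another (for example 'lov' and 'lov u'), sorting the keys by length, longest first, before joining them is a common precaution. The dictionary above does not need it.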

If you only need to match search phrases that sit between whitespace chars or at the start/end of the string, replace (?<!\w) with (?<!\S) and (?!\w) with (?!\S).
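The whitespace-boundary variant is stricter, which the following sketch illustrates on the same sample text. The names strict_pattern and strict_result are illustrative. Note that keys directly followed by punctuation are no longer replaced.

```python
import re

my_text = 'brb some sample text I lov u. I need some $$ for 2mw.'

dictionary = {
    'brb': 'be right back',
    'lov u': 'love you',
    '$$': 'money',
    '2mw': 'tomorrow',
}

alternation = '|'.join(re.escape(key) for key in dictionary)

# (?<!\S) and (?!\S) require whitespace (or the string start/end)
# on both sides of the match.
strict_pattern = re.compile(r'(?<!\S)(' + alternation + r')(?!\S)')

strict_result = strict_pattern.sub(lambda m: dictionary[m.group()], my_text)
print(strict_result)
# be right back some sample text I lov u. I need some money for 2mw.
```

Here 'lov u' and '2mw' are left untouched because each is immediately followed by a period, which is a non-whitespace char. Whether that is desirable depends on the text being processed.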
