Text data replacement using a dictionary

Views: 275 | Published: 2020/5/4 4:02:50 | Tags: python, dictionary, nlp, lookup

Problem description

I have a DataFrame with the structure below:

ID text
0  Language processing in python th is great
1  Relace the string

And a dictionary named custom_fix:

{'Relace': 'Replace', 'th' : 'three'}

I tried the code below, but it also replaces keys that appear inside longer words. Current output:

ID text
0  Language processing in pythreeon three is great
1  Replace threee string

Code:

import re

def multiple_replace(d, text):
    # Create a regular expression from the dictionary keys
    regex = re.compile("(%s)" % "|".join(map(re.escape, d.keys())))

    # For each match, look up the corresponding value in the dictionary
    return regex.sub(lambda mo: d[mo.group(0)], text)

df['col1'] = df.apply(lambda row: multiple_replace(custom_fix, row['text']), axis=1)

Expected output:

ID text
0  Language processing in python three is great
1  Replace the string

Recommended answer

I'm not a regex expert, and this may not be the best solution, but wrapping each key in word boundaries (\b) should fix the problem, because the pattern then matches a key only when it stands alone as a whole word. Here is the fixed function:

import re

def multiple_replace(d, text):
    # Create a regular expression from the dictionary keys, escaping each key
    # and wrapping it in \b so that it only matches as a whole word
    regex = re.compile("|".join(r"\b" + re.escape(k) + r"\b" for k in d))

    # For each match, look up the corresponding value in the dictionary
    return regex.sub(lambda mo: d[mo.group(0)], text)
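
To see the effect of the word boundaries, the sketch below runs the question's sample rows through the same kind of pattern with and without \b. The helper names compile_fixes and apply_fixes are hypothetical, chosen only for this illustration, and plain lists stand in for the DataFrame column so the example needs nothing beyond the standard library.

```python
import re

# Sample data copied from the question
custom_fix = {'Relace': 'Replace', 'th': 'three'}
texts = [
    'Language processing in python th is great',
    'Relace the string',
]

def compile_fixes(fixes, whole_words=True):
    # Hypothetical helper: build one alternation pattern from the keys.
    # Longer keys go first so a key that is a prefix of another cannot shadow it.
    keys = sorted(fixes, key=len, reverse=True)
    boundary = r"\b" if whole_words else ""
    return re.compile("|".join(boundary + re.escape(k) + boundary for k in keys))

def apply_fixes(pattern, fixes, text):
    # Replace every match with the value stored under the matched key
    return pattern.sub(lambda mo: fixes[mo.group(0)], text)

naive = compile_fixes(custom_fix, whole_words=False)
bounded = compile_fixes(custom_fix)

without_boundaries = [apply_fixes(naive, custom_fix, t) for t in texts]
with_boundaries = [apply_fixes(bounded, custom_fix, t) for t in texts]

print(without_boundaries)
# ['Language processing in pythreeon three is great', 'Replace threee string']
print(with_boundaries)
# ['Language processing in python three is great', 'Replace the string']
```

Compiling the pattern once and reusing it avoids rebuilding the regex for every row. Assuming df is the DataFrame from the question, the same call could be applied per row with something like df['text'].apply(lambda t: apply_fixes(bounded, custom_fix, t)). Note that \b only behaves this way when each key starts and ends with a word character (letter, digit, or underscore); keys with leading or trailing punctuation would need a different boundary test.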
