Identify strings while removing substrings in Python

Published: 2021/7/6 20:37:49 Tags: python, regex


Problem description

I have a dictionary of words with their frequencies, as follows.

mydictionary = {'yummy tim tam':3, 'milk':2, 'chocolates':5, 'biscuit pudding':3, 'sugar':2}

I have a string (with punctuation marks removed), as follows.

recipes_book = "For todays lesson we will show you how to make biscuit pudding using yummy tim tam milk and rawsugar"

From the above string, I need to output only "biscuit pudding", "yummy tim tam" and "milk" by referring to the dictionary. It should NOT output "sugar", because in the string it appears only as part of "rawsugar".

However, the code I am currently using outputs "sugar" as well.

import re

mydictionary = {'yummy tim tam':3, 'milk':2, 'chocolates':5, 'biscuit pudding':3, 'sugar':2}
recipes_book = "For today's lesson we will show you how to make biscuit pudding using yummy tim tam milk and rawsugar"
# Plain alternation of the keys: each key can match anywhere, even inside a longer word
searcher = re.compile(r'{}'.format("|".join(mydictionary.keys())), flags=re.I | re.S)

for match in searcher.findall(recipes_book):
    print(match)

How can I avoid matching substrings like that and consider only full tokens such as 'milk'? Please help me.

Recommended answer

You can update your code to use regex word boundaries (\b), so that each key matches only as a whole word:

import re

mydictionary = {'yummy tim tam':3, 'milk':2, 'chocolates':5, 'biscuit pudding':3, 'sugar':2}
recipes_book = "For today's lesson we will show you how to make biscuit pudding using yummy tim tam milk and rawsugar"
# Wrap every key in \b...\b so it cannot match inside a longer word such as "rawsugar"
searcher = re.compile(r'{}'.format("|".join(map(lambda x: r'\b{}\b'.format(x), mydictionary.keys()))), flags=re.I | re.S)

for match in searcher.findall(recipes_book):
    print(match)

Output:

biscuit pudding
yummy tim tam
milk
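
As a hedged variation on the answer above (not part of the original answer), the same word-boundary idea can be wrapped in a small helper. The function name find_whole_terms is hypothetical. The sketch assumes every dictionary key starts and ends with a word character, since \b only behaves as intended in that case. It also passes each key through re.escape in case a key contains regex metacharacters, and tries longer keys first so that multi-word phrases are preferred over shorter overlapping keys.

```python
import re

def find_whole_terms(terms, text):
    """Return the terms found in text as whole words, in order of appearance.

    Hypothetical helper for illustration; assumes each term starts and
    ends with a word character so that \\b anchors work as intended.
    """
    # Longer terms first, so a phrase wins over a shorter term at the same position.
    ordered = sorted(terms, key=len, reverse=True)
    # re.escape keeps characters such as '.' or '+' in a term from acting as regex syntax.
    alternation = "|".join(re.escape(term) for term in ordered)
    # One pair of word boundaries around the whole non-capturing group.
    pattern = re.compile(r"\b(?:{})\b".format(alternation), flags=re.I)
    return pattern.findall(text)

mydictionary = {'yummy tim tam': 3, 'milk': 2, 'chocolates': 5, 'biscuit pudding': 3, 'sugar': 2}
recipes_book = "For todays lesson we will show you how to make biscuit pudding using yummy tim tam milk and rawsugar"

matches = find_whole_terms(mydictionary, recipes_book)
print(matches)  # ['biscuit pudding', 'yummy tim tam', 'milk']
```

Because the match is case-insensitive, the returned strings keep the casing found in the text, so they may need to be lowercased before being used to look up frequencies in the dictionary.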
