Finding occurrences of words in a text that appear in a list of words


Question

Possible Duplicate:
Check if multiple strings exist in another string

Say I have a list of allowed words/phrases:

'Stack'
'Overflow'
'Stack Overflow'
'Stack Exchange'
'Exchange'

and the following text to parse:

'Hello, and welcome to Stack Overflow. 
 Here are some words which should match: Stack, Exchange.'

I'd like to get the list of words from the text that are found in the allowed list:

  • 'Stack Overflow'
  • 'Stack'
  • 'Exchange'

What would be the best way to achieve this result?

The allowed list I'll be using could contain at least a thousand words/phrases.

Recommended answer

Put the allowed words in a list, split the text into words, and then take the intersection of the two:

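def intersect(x, y):
    # Return the items common to both sequences (order is not preserved).
    return list(set(x) & set(y))

# `text` is the text to parse and `words` is the allowed list from the question.
# str.split() replaces the Python 2-only string.split(text).
word_list_text = text.split()
words_found = intersect(word_list_text, words)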
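
Note that splitting on whitespace leaves punctuation attached to tokens ('Stack,' and 'Exchange.') and cannot match multi-word phrases such as 'Stack Overflow'. The following is a minimal sketch of one possible alternative, not part of the original answer: it builds a single regular expression from the allowed list, trying longer phrases first. The function name `find_allowed` is hypothetical, and the sketch assumes that matching is case-sensitive and that a phrase should be reported once, in order of first appearance.

```python
import re


def find_allowed(text, allowed):
    """Return the allowed words/phrases found in text, in order of first appearance."""
    # Longest entries first, so 'Stack Overflow' is preferred over 'Stack'
    # when both could match at the same position.
    ordered = sorted(set(allowed), key=len, reverse=True)
    if not ordered:
        return []
    # One alternation for the whole list; \b keeps matches on word boundaries.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in ordered) + r")\b"
    )
    found = []
    for match in pattern.finditer(text):
        phrase = match.group(0)
        if phrase not in found:  # report each phrase only once
            found.append(phrase)
    return found


allowed = ['Stack', 'Overflow', 'Stack Overflow', 'Stack Exchange', 'Exchange']
text = ('Hello, and welcome to Stack Overflow. \n'
        ' Here are some words which should match: Stack, Exchange.')
result = find_allowed(text, allowed)
print(result)  # ['Stack Overflow', 'Stack', 'Exchange']
```

Because `finditer` returns non-overlapping matches, 'Stack' and 'Overflow' are not reported separately where they occur inside 'Stack Overflow', which is the result the question asks for. For a list of a few thousand phrases a single compiled alternation is usually workable; whether it is fast enough for a given workload would need to be measured.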
