Algorithm for multiple word matching in text


Problem description

I have a large set of words (about 10,000) and I need to find out whether any of those words appear in a given block of text.

Is there a faster algorithm than doing a simple text search for each of the words in the block of text?

Recommended answer

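Load the 10,000 words into a hash table, then check each word in the block of text to see whether the hash has an entry for it. Each lookup takes constant time on average, so the work grows with the length of the text rather than with the number of words in the list.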

I don't know whether it is faster, though; it is just another method (it would depend on how many words you are searching for).

A simple Perl example:

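use strict;
use warnings;

my $word_block = "the guy went afk after being popped by a brownrabbit";
my %hash = ();
# Split the text on runs of whitespace.
my @words = split /\s+/, $word_block;
# Load the word list from the __DATA__ section into the hash.
while (<DATA>) { chomp; $hash{$_} = 1; }
foreach my $word (@words)
{
    print "found word: $word\n" if exists $hash{$word};
}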

__DATA__
afk
lol
brownrabbit
popped
garbage
trash
sitdown
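
The lookup above compares exact strings, so a word followed by punctuation or written in a different case would be missed. The following is a minimal sketch of one way to handle that, not part of the original answer: the helper name find_listed_words is hypothetical, and lowercasing everything and treating each run of \w characters as a word are assumptions that may not suit every text.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the distinct listed words found in $text,
# in order of first appearance.
sub find_listed_words {
    my ($text, @word_list) = @_;

    # Build the lookup table once; keys are lowercased (assumption:
    # matching should be case-insensitive).
    my %listed;
    $listed{ lc $_ } = 1 for @word_list;

    my (%seen, @found);
    # Take runs of word characters as tokens, so "afk," still matches "afk".
    for my $token ($text =~ /\w+/g) {
        my $word = lc $token;
        next unless exists $listed{$word};
        push @found, $word unless $seen{$word}++;
    }
    return @found;
}

my @hits = find_listed_words(
    "The guy went AFK, then got popped by a brownrabbit. afk again!",
    qw(afk lol brownrabbit popped garbage trash sitdown),
);
print "found word: $_\n" for @hits;
# prints afk, popped and brownrabbit, one per line
```

Building the table is a one-time cost, so the same word list can be reused across many blocks of text by keeping the hash around instead of rebuilding it on every call.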
