Most efficient way to find matching words from paragraph


Question

I have a paragraph that I need to parse for different keywords. For example, the paragraph:

"I want to make a change in the world. Want to make it a better place to live. Peace, Love and Harmony. It is all life is all about. We can make our world a good place to live"

My keywords are:

"world", "earth", "place"

I need to report every keyword that matches and how many times it occurs.

The output should be:

"world" 2 times and "place" 1 time

Currently, I just convert the paragraph string to an array of characters and then match each keyword against the whole array, which wastes resources. Please point me to a more efficient way. (I am using PHP.)

Recommended answer

As @CasimiretHippolyte commented, regex is the better approach because word boundaries can be used. Caseless matching is also possible with the i flag. Use it together with the return value of preg_match_all:

Returns the number of full pattern matches (which might be zero), or FALSE if an error occurred.
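
To illustrate why the word boundary matters, here is a minimal sketch comparing a plain substring count with a \b-anchored, caseless pattern. The sample sentence and variable names are made up for illustration and do not come from the question.

```php
<?php
// Hypothetical sample text, chosen so that "place" also appears inside longer words.
$text = "Place the placeholder in its place; replace it later.";

// substr_count is case-sensitive and counts raw substrings,
// including occurrences inside other words ("placeholder", "replace").
$substringCount = substr_count($text, "place");

// \b anchors the match to word boundaries; the i flag makes it caseless.
// preg_match_all returns the number of full pattern matches.
$wordCount = preg_match_all('/\bplace\b/i', $text);

echo "substring: $substringCount, whole word: $wordCount\n";
// prints "substring: 3, whole word: 2"
```

The substring count misses the capitalised "Place" and picks up "placeholder" and "replace", while the regex counts only the two standalone occurrences.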

The pattern for matching one word is /\bword\b/i. Build an array whose keys are the words from the search list $words and whose values are the match counts that preg_match_all returns for each word:

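$words = array("earth", "world", "place", "foo");

$str = "at Earth Hour the world-lights go out and make every place on the world dark";

// Map each search word to its number of whole-word, caseless matches in $str.
$res = array_combine($words, array_map(function ($w) use ($str) {
    // preg_quote escapes regex metacharacters and the "/" delimiter in the word.
    return preg_match_all('/\b' . preg_quote($w, '/') . '\b/i', $str);
}, $words));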

print_r($res);

Tested at eval.in, this outputs:

Array ( [earth] => 1 [world] => 2 [place] => 1 [foo] => 0 )
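
The question asks for a report in words rather than a raw array. As a sketch, a word-count array like the one above could be turned into such a sentence as follows. The helper name formatReport is hypothetical and not part of the original answer; the sample data is the same as in the answer.

```php
<?php
// Hypothetical helper: turns a word => count map into a short readable report.
function formatReport(array $counts)
{
    $parts = array();
    foreach ($counts as $word => $count) {
        if ($count > 0) { // skip keywords that never matched
            $parts[] = '"' . $word . '" ' . $count . ' ' . ($count === 1 ? 'time' : 'times');
        }
    }
    return implode(' and ', $parts);
}

$words = array("earth", "world", "place", "foo");
$str = "at Earth Hour the world-lights go out and make every place on the world dark";

// Count whole-word, caseless matches for each keyword.
$counts = array();
foreach ($words as $w) {
    $counts[$w] = preg_match_all('/\b' . preg_quote($w, '/') . '\b/i', $str);
}

echo formatReport($counts), "\n";
// prints: "earth" 1 time and "world" 2 times and "place" 1 time
```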

preg_quote is used to escape the words; this is not necessary if you know they contain no special characters. The inline anonymous function passed to array_map requires PHP 5.3, and calling preg_match_all without the third argument requires PHP 5.4 or later.
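
With a long keyword list, one regex per keyword scans the text once per word. As an alternative that is not part of the original answer, the text could be tokenised once and the keywords looked up in a frequency table. This is only a sketch, assuming the keywords consist solely of word characters (letters, digits, underscore) and that ASCII lowercasing is sufficient; whether it is actually faster depends on the input and has not been measured here.

```php
<?php
$words = array("earth", "world", "place", "foo");
$str = "at Earth Hour the world-lights go out and make every place on the world dark";

// Tokenise once: \w+ splits at the same positions that \b treats as boundaries.
preg_match_all('/\w+/', strtolower($str), $matches);

// Build a frequency table of every token in the text.
$frequencies = array_count_values($matches[0]);

// Look each keyword up in the table; keywords that never occur count as 0.
$res = array();
foreach ($words as $w) {
    $key = strtolower($w);
    $res[$w] = isset($frequencies[$key]) ? $frequencies[$key] : 0;
}

print_r($res);
```

For this sample input the counts are the same as with the per-keyword regex: earth 1, world 2, place 1, foo 0.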
