Regex multi word search


Question

What do I use to search for multiple words in a string? I would like the logical operation to be AND, so that all the words appear somewhere in the string. I have a bunch of nonsense paragraphs and one plain English paragraph, and I'd like to narrow it down by specifying a couple of common words such as "the" and "and", but I would like it to match all the words I specify.

Recommended answer

Using a language recognition chart to pick out the English paragraph might work. It held up in some quick tests (this assumes paragraphs are separated only by newlines).

The regexp matches as soon as any one of its alternatives matches, so it is an OR over typical English cues rather than an AND. \bword\b matches word as a whole word delimited by boundaries, word\b matches word at the end of a word, and bare word matches anywhere in the paragraph.
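To make the three forms concrete, here is a small sketch. The helper name match_forms and the sample strings are made up for illustration and are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: reports whether "the" matches $s as a whole word,
# as a word ending, and anywhere at all (1 = match, 0 = no match).
sub match_forms {
    my ($s) = @_;
    my $whole  = $s =~ /\bthe\b/ ? 1 : 0;   # "the" as a complete word
    my $ending = $s =~ /the\b/   ? 1 : 0;   # "the" at the end of a word
    my $any    = $s =~ /the/     ? 1 : 0;   # "the" anywhere in the string
    return ($whole, $ending, $any);
}

# Sample strings chosen to exercise each form.
for my $s ('the cat', 'bathe', 'other') {
    my ($whole, $ending, $any) = match_forms($s);
    print "$s: whole=$whole ending=$ending anywhere=$any\n";
}
```

Only "the cat" satisfies all three forms. "bathe" matches as an ending and anywhere, and "other" matches only the unanchored form. This is why the unanchored alternatives in the answer's pattern (th, sh and so on) are very permissive.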

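use strict; use warnings;
use utf8;                               # the pattern contains a literal ’ (U+2019)
use open qw(:std :encoding(UTF-8));     # decode STDIN and encode STDOUT as UTF-8
my $text = do { local $/; <STDIN> };    # assumption: the whole text arrives on STDIN
my @paragraphs = split /\n/, $text;
for my $p (@paragraphs) {
    # One matching alternative is enough: a common whole word, a letter cluster, or a word ending.
    if ($p =~ m/\b(?:the|and|an|in|on|that|is|are)\b|th|sh|ough|augh|(?:ing|tion|ed|age|['’](?:s|ve|d)|n['’]t)\b/) {
        print "Probable English\n$p\n";
    }
}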
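The pattern above succeeds when any single cue is present. For the AND behaviour the question asks about, one common technique is to chain one lookahead per required word. The following is a minimal sketch under that assumption; the helper name contains_all_words and the sample paragraphs are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if every word in @words occurs in $text
# as a whole word, in any order and ignoring case.
sub contains_all_words {
    my ($text, @words) = @_;
    # Build one lookahead per word, e.g. (?=.*\bthe\b)(?=.*\band\b).
    # Each lookahead scans from the start without consuming any input.
    my $pattern = join '', map { '(?=.*\b' . quotemeta($_) . '\b)' } @words;
    # /s lets . cross newlines; /i ignores case.
    return $text =~ /\A$pattern/is ? 1 : 0;
}

# Made-up sample paragraphs: one English, one nonsense.
my @paragraphs = (
    'The quick fox and the lazy dog.',
    'Xq zvb and wkrl pmt.',
);
for my $p (@paragraphs) {
    print "$p\n" if contains_all_words($p, 'the', 'and');
}
```

With these samples only the first paragraph is printed, because the second one contains "and" but not "the". Running one simple match per word and requiring all of them to succeed would give the same result and may be easier to read when the word list is long.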
