RegEx that matches only if a string contains a word from each list

Question

I'm developing software that has to check whether a text contains a word taken from one specified list and a word taken from another specified list.

Example:

list 1: dog, cat
list 2: house, tree

The following texts must match:

the dog is in the house -> contains dog and house
my house is full of dogs -> contains dog and house
the cat is on the tree -> contains cat and tree

The following texts must not match:

the frog is in the house -> there is no word from the first list
Boby is the name of my dog -> there is no word from the second list
Outside my house there is a tree -> there is no word from the first list

As a quick fix, I've made a list of patterns like:

dog.*house, house.*dog, cat.*house, ...

but I'm pretty sure there is a smarter way...

Recommended answer

You can use an alternation (|) for each set of alternatives, and a wrapping alternation to cover both orders. So:

(?:(?:dog|cat).*(?:house|tree))|(?:(?:house|tree).*(?:dog|cat))
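
If the two lists are only known at runtime, the same pattern can be assembled from arrays. The following is a minimal sketch, not part of the original answer; `escapeRegExp` and `buildEitherOrder` are hypothetical helper names introduced here for illustration.

```javascript
// Hypothetical helper: escape regex metacharacters so list entries match literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build the "either order" pattern from two word lists.
function buildEitherOrder(listA, listB) {
  var a = "(?:" + listA.map(escapeRegExp).join("|") + ")";
  var b = "(?:" + listB.map(escapeRegExp).join("|") + ")";
  // a before b, or b before a
  return new RegExp("(?:" + a + ".*" + b + ")|(?:" + b + ".*" + a + ")");
}

var rexFromLists = buildEitherOrder(["dog", "cat"], ["house", "tree"]);
console.log(rexFromLists.source);
console.log(rexFromLists.test("the dog is in the house"));  // true
console.log(rexFromLists.test("the frog is in the house")); // false
```

Note that `.` does not match line terminators by default, so a text that spans several lines would need a different wildcard (for example `[\s\S]*`) between the two groups.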

JavaScript example (non-capturing groups and alternations work the same in Java and JavaScript):

var tests = [
    {match: true,  text: "the dog is in the house -> contains dog and house"},
    {match: true,  text: "my house is full of dogs -> contains dog and house"},
    {match: true,  text: "the cat is on the tree -> contains cat and tree"},
    {match: false, text: "the frog is in the house -> there is no word from the first list"},
    {match: false, text: "Boby is the name of my dog -> there is no word from the second list"},
    {match: false, text: "Outside my house there is a tree -> there is no word from the first list"}
];
var rex = /(?:(?:dog|cat).*(?:house|tree))|(?:(?:house|tree).*(?:dog|cat))/;
tests.forEach(function(test) {
  var result = rex.test(test.text);
  if (!!result == !!test.match) {
    console.log('GOOD: "' + test.text + '": ' + result);
  } else {
    console.log('BAD: "' + test.text + '": ' + result + ' (expected ' + test.match + ')');
  }
});
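
As an aside, a different technique (not the one used in the answer) is to anchor at the start of the string and use one lookahead per list. This avoids spelling out every order, which may matter if more than two lists are involved. A minimal sketch, with `lookaheadRex` and `samples` as names assumed for illustration:

```javascript
// Alternative technique: each lookahead asserts that some word from one list
// appears somewhere after the start of the string; order does not matter.
var lookaheadRex = /^(?=.*(?:dog|cat))(?=.*(?:house|tree))/;

var samples = [
  "the dog is in the house",    // word from both lists
  "my house is full of dogs",   // word from both lists, reversed order
  "the frog is in the house",   // nothing from the first list
  "Boby is the name of my dog"  // nothing from the second list
];
samples.forEach(function (text) {
  console.log(JSON.stringify(text) + ": " + lookaheadRex.test(text));
});
```

Like the alternation pattern, this checks letter sequences only, not whole words.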

Note that in the above we're not checking for words, just sequences of letters, so "cat" would also be found inside "catalog". If you want it to be actual words, you'll need to add word boundary assertions (such as \b) or similar. Left as an exercise to the reader...
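
One possible take on that exercise is sketched below. It is an assumption about what "actual words" should mean here, not the answerer's solution: each word is wrapped in `\b` assertions, and an optional trailing `s` is allowed so that "dogs" still counts as "dog", as the question's second example requires. The names `first`, `second`, and `wordRex` are introduced for illustration.

```javascript
// Assumption: a word may carry an optional plural "s" ("dogs" counts as "dog").
var first = "\\b(?:dog|cat)s?\\b";
var second = "\\b(?:house|tree)s?\\b";

// Same either-order wrapper as before, now around whole-word groups.
var wordRex = new RegExp(
  "(?:" + first + ".*" + second + ")|(?:" + second + ".*" + first + ")"
);

console.log(wordRex.test("the dog is in the house"));  // true
console.log(wordRex.test("my house is full of dogs")); // true
console.log(wordRex.test("a catalog of treehouses"));  // false: no whole word from either list
```

The last sample would match the letter-sequence pattern, because "catalog" contains "cat" and "treehouses" contains "house". Keep in mind that `\b` is defined in terms of `\w` characters, so this sketch suits plain ASCII words best.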
