JavaScript Regex Word Boundary with optional non-word character


Question

I am looking to find a keyword match in a string. I am trying to use word boundary, but this may not be the best case for that solution. The keyword could be any word, and could be preceded with a non-word character. The string could be any string at all and could include all three of these words in the array, but I should only match on the keyword:

['hello', '#hello', '@hello'];

Here is my code, which includes an attempt found in post:

let userStr = 'why hello there, or should I say #hello there?';
let keyword = '#hello';
let re = new RegExp(`/(#\b${userStr})\b/`);
re.exec(keyword);

  • This would be great if the string always started with #, but it does not.
  • I then tried this /(#?\b${userStr})\b/, but if the string does start with #, it tries to match ##hello.
  • The matchThis string could be any of the 3 examples in the array, and userStr may contain several variations of matchThis, but only one will be exact.

Recommended answer

      You need to account for 3 things here:


      • The main point is that the \b word boundary is a context-dependent construct: it matches only between a word char and a non-word char (or a string edge). If your input is not always alphanumeric-only, you need unambiguous word boundaries.
      • You need to double the backslashes of regex escapes (\\b, \\W, \\w) when you build the pattern from a string for the RegExp constructor.
      • As you pass a variable to a regex, you need to make sure all special chars in it are properly escaped.
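
The first two points can be checked directly. The snippet below is a small illustration, not part of the original answer; the sample string and the variable names are assumptions chosen for the demo.

```javascript
const text = 'say #hello there';

// Between " " and "#" there is no word boundary: both are non-word chars.
const boundaryBeforeHash = /\b#hello\b/.test(text);

// The same pattern does match once a word char precedes "#".
const boundaryAfterWordChar = /\b#hello\b/.test('a#hello');

// In an ordinary string or template literal, "\b" is a backspace (U+0008),
// so the regex escape has to be written as "\\b".
const singleBackslash = '\b' === '\u0008';
const doubledBackslash = '\\b'.length === 2;

// Explicit boundaries: start of string or a non-word char on the left,
// no word char on the right.
const explicit = new RegExp('(?:^|\\W)#hello(?!\\w)').test(text);

console.log({
  boundaryBeforeHash,
  boundaryAfterWordChar,
  singleBackslash,
  doubledBackslash,
  explicit,
});
```

This is why the attempt with `\b` inside a template literal cannot work as written: the regex engine never sees a word boundary assertion, only a backspace character.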

      Use

      let userStr = 'why hello there, or should I say #hello there?';
      let keyword = '#hello';
      let re_pattern = `(?:^|\\W)(${keyword.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')})(?!\\w)`;
      let res = [], m;
      
      // To find a single (first) match
      console.log((m=new RegExp(re_pattern).exec(userStr)) ? m[1] : "");
      
      // To find multiple matches:
      let rx = new RegExp(re_pattern, "g");
      while (m=rx.exec(userStr)) {
          res.push(m[1]);
      }
      console.log(res);

      Pattern details


      • (?:^|\\W) - a non-capturing group matching the start of string or any non-word char
      • (${keyword.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')}) - Group 1: a keyword value with escaped special chars
      • (?!\\w) - a negative lookahead that fails the match if there is a word char immediately to the right of the current location.
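
As a rough sketch, the same pattern can be wrapped in a reusable function. The names `escapeRegExp` and `findKeyword` are hypothetical helpers introduced here for illustration, and the empty-keyword guard is an added assumption to avoid a zero-length match looping forever.

```javascript
// Hypothetical helper: escape regex metacharacters in a literal string,
// using the same character class as the answer above.
function escapeRegExp(s) {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Hypothetical helper: return every occurrence of `keyword` in `text` that is
// preceded by the start of string or a non-word char and is not followed by
// a word char.
function findKeyword(text, keyword) {
  if (keyword === '') return []; // assumption: an empty keyword matches nothing
  const rx = new RegExp(`(?:^|\\W)(${escapeRegExp(keyword)})(?!\\w)`, 'g');
  const found = [];
  let m;
  while ((m = rx.exec(text)) !== null) {
    found.push(m[1]); // Group 1 holds the keyword without the leading char
  }
  return found;
}

const userStr = 'why hello there, or should I say #hello there?';
console.log(findKeyword(userStr, '#hello'));
console.log(findKeyword(userStr, '@hello'));
console.log(findKeyword(userStr, 'hello'));
```

One thing to be aware of: because `#` is itself a non-word char, searching for the plain keyword `hello` also finds the `hello` inside `#hello`. If that is not wanted, the left-hand boundary would need to be tightened, for example to whitespace or start of string only.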
