JavaScript RegExp for exact multiple words with special characters match


Question

I'm using RegExp to match multiple words. The terms are dynamic, so when one contains a special character such as "(", it is treated as part of the expression and I get Uncaught SyntaxError: Invalid regular expression.

let text = 'working text and (not working text'
let findTerm = ['working text', '(not working text']
let replaceFromRegExp = new RegExp('\\b'+`(${findTerm.join("|")})`+'\\b', 'g')
text = text.replace(replaceFromRegExp, match => "<mark>" + match + "</mark>")
console.log(text)

Recommended answer

A \b word boundary matches at any of the following three positions:

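  1. Before the first character in the string, if the first character is a word character.
  2. After the last character in the string, if the last character is a word character.
  3. Between two characters in the string, where one is a word character and the other is not a word character.

You need more general word boundaries here: a non-word character or the start of the string before the search term, and a non-word character or the end of the string after it.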
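To make the positions above concrete, here is a minimal sketch of why a plain \b cannot sit in front of "(". The lookaround form `(?<!\w)` / `(?!\w)` used for contrast is an assumption of this example (a compact way to say "no word character on this side"), not the pattern from the answer itself.

```javascript
// Sample text taken from the question.
const sample = 'working text and (not working text';

// \b before "(" needs a word char on exactly one side of that position.
// Here the left side is a space and the right side is "(", so \b fails.
const plain = /\b\(not working text\b/;
const plainMatches = plain.test(sample);

// Lookarounds only demand that no word char touches the match on either side.
const lookaround = /(?<!\w)\(not working text(?!\w)/;
const lookaroundMatches = lookaround.test(sample);

console.log(plainMatches, lookaroundMatches); // false true
```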

Note that you also need to sort the findTerm items by length in descending order to avoid problems with overlapping terms.
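The overlap problem can be seen with a small sketch. The term list below is hypothetical (chosen so that one term is a prefix of another) and contains no special characters, so escaping is left out here.

```javascript
// Hypothetical terms: the first is a prefix of the second.
const terms = ['working', 'working text'];
const source = 'working text';

// Alternation is tried left to right, so the shorter term wins when it comes first.
const unsorted = new RegExp(`(?:${terms.join('|')})`, 'g');
const unsortedResult = source.replace(unsorted, '<mark>$&</mark>');
console.log(unsortedResult); // <mark>working</mark> text

// Sorting by length (descending) lets the longer term be tried first.
const sortedTerms = [...terms].sort((a, b) => b.length - a.length);
const sorted = new RegExp(`(?:${sortedTerms.join('|')})`, 'g');
const sortedResult = source.replace(sorted, '<mark>$&</mark>');
console.log(sortedResult); // <mark>working text</mark>
```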

Finally, do not forget to escape the findTerm items before using them in a regex pattern.
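As a rough sketch of the escaping step on its own: the helper name `escapeRegExp` is hypothetical, while the character class is the same one used in the full snippet in this answer.

```javascript
// Hypothetical helper: prefixes every regex metacharacter with a backslash.
function escapeRegExp(term) {
  return term.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

const escaped = escapeRegExp('(not working text');
console.log(escaped); // \(not working text

// The unescaped term reproduces the error from the question.
let threw = false;
try {
  new RegExp('(not working text');
} catch (e) {
  threw = e instanceof SyntaxError;
}

// The escaped term compiles and matches the literal "(".
const compiles = new RegExp(escaped).test('a (not working text');
console.log(threw, compiles); // true true
```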

You can use

let text = 'working text and (not working text'
let findTerm = ['working text', '(not working text']
findTerm.sort((a, b) => b.length - a.length);
let replaceFromRegExp = new RegExp(String.raw`(?:\B(?!\w)|\b(?=\w))(?:${findTerm.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join("|")})(?:(?<=\w)\b|(?<!\w)\B)`, 'g')
// If the boundaries for special chars should not be checked remove \B:
// let replaceFromRegExp = new RegExp(String.raw`(?:(?!\w)|\b(?=\w))(?:${findTerm.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join("|")})(?:(?<=\w)\b|(?<!\w))`, 'g')
console.log(replaceFromRegExp)
text = text.replace(replaceFromRegExp, "<mark>$&</mark>")
console.log(text)

Note that "<mark>$&</mark>" is a shorter way of writing match => "<mark>" + match + "</mark>", because $& in a string replacement pattern refers to the whole match value.
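A quick way to check that the two replacement forms agree is the sketch below; the input string and pattern are made up for illustration.

```javascript
// Illustrative input and pattern, not taken from the question.
const input = 'some working text here';
const pattern = /working text/g;

// String replacement pattern: $& stands for the whole match.
const viaString = input.replace(pattern, '<mark>$&</mark>');

// Callback form: the first argument is the whole match.
const viaCallback = input.replace(pattern, match => '<mark>' + match + '</mark>');

console.log(viaString);                 // some <mark>working text</mark> here
console.log(viaString === viaCallback); // true
```

The callback form is the safer choice when the replacement text itself comes from dynamic input, because sequences starting with `$` are interpreted specially in a replacement string.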

The resulting regexes are

/(?:\B(?!\w)|\b(?=\w))(?:\(not working text|working text)(?:(?<=\w)\b|(?<!\w)\B)/g

/(?:(?!\w)|\b(?=\w))(?:\(not working text|working text)(?:(?<=\w)\b|(?<!\w))/g

See the regex #1 demo and the regex #2 demo. Details:

  • (?:\B(?!\w)|\b(?=\w)) - either a non-word boundary if the next char is not a word char, or a word boundary if the next char is a word char
  • (?:(?!\w)|\b(?=\w)) - either the next char must be a non-word char, or there must be no word char immediately to the left of the current location and the next char must be a word char (if the term starts with a special char, no boundary is required)
  • (?:\(not working text|working text) - a non-capturing group matching one of the alternative patterns built from the findTerm array
  • (?:(?<=\w)\b|(?<!\w)\B) - either a word boundary if the preceding char is a word char, or a non-word boundary if the preceding char is not a word char
  • (?:(?<=\w)\b|(?<!\w)) - either the previous char is a word char and the next one must not be a word char, or the previous char is not a word char (if the term ends with a special char, no boundary is required)
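The difference between the two variants only shows up when a special character touches a word character. The sketch below wraps both patterns in one builder; the names `escapeTerm`, `buildTermRegex`, and `checkSpecialCharBoundaries`, and the input 'x(not working text', are assumptions made for this illustration.

```javascript
// Hypothetical helper: escapes regex metacharacters in a term.
const escapeTerm = t => t.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

// Hypothetical builder: true yields regex #1 (with \B), false yields regex #2.
function buildTermRegex(terms, checkSpecialCharBoundaries = true) {
  const body = [...terms]
    .sort((a, b) => b.length - a.length) // longest term first
    .map(escapeTerm)
    .join('|');
  const nb = checkSpecialCharBoundaries ? String.raw`\B` : '';
  return new RegExp(
    String.raw`(?:${nb}(?!\w)|\b(?=\w))(?:${body})(?:(?<=\w)\b|(?<!\w)${nb})`,
    'g'
  );
}

const findTerm = ['working text', '(not working text'];
const glued = 'x(not working text'; // "(" directly follows a word char

// Regex #1 rejects "(" glued to "x", so only the shorter term is marked.
const strictResult = glued.replace(buildTermRegex(findTerm, true), '<mark>$&</mark>');
console.log(strictResult); // x(not <mark>working text</mark>

// Regex #2 does not check the boundary before "(", so the longer term is marked.
const looseResult = glued.replace(buildTermRegex(findTerm, false), '<mark>$&</mark>');
console.log(looseResult); // x<mark>(not working text</mark>
```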
