JavaScript regex for all words not between certain characters
Problem description
I'm trying to return a count of all words NOT between square brackets. So given:
[don't match these words] but do match these
I want to get a count of 4, for the last four words.
This works in .NET:
\b(?<!\[)[\w']+(?!\])\b
But it won't work in JavaScript because it doesn't support lookbehind.
Any ideas for a pure js regex solution?
Recommended answer
Ok, I think this should work:
\[[^\]]+\](?:^|\s)([\w']+)(?!\])\b|(?:^|\s)([\w']+)(?!\])\b
You can test it here:
http://regexpal.com/
If you need an alternative with text in square brackets coming after the main text, it could be added as a second alternative and the current second one would become third.
It's a bit complicated but I can't think of a better solution right now.
If you need to do something with the actual matches you will find them in the capturing groups.
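As a minimal sketch of reading those capturing groups, the snippet below runs the regex above with the global flag over the sample sentence from the question. The helper name `wordsOutsideBrackets` is hypothetical, and `String.prototype.matchAll` assumes an ES2020-capable engine.

```javascript
// Regex from the answer above, with the global flag so every match is visited.
const outsideBrackets =
  /\[[^\]]+\](?:^|\s)([\w']+)(?!\])\b|(?:^|\s)([\w']+)(?!\])\b/g;

// Hypothetical helper: collects the word captured by whichever alternative matched.
function wordsOutsideBrackets(text) {
  const words = [];
  for (const match of text.matchAll(outsideBrackets)) {
    // Group 1 is filled by the first alternative, group 2 by the second.
    words.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return words;
}

const sampleText = "[don't match these words] but do match these";
console.log(wordsOutsideBrackets(sampleText));        // [ 'but', 'do', 'match', 'these' ]
console.log(wordsOutsideBrackets(sampleText).length); // 4
```

Note that the full match (`match[0]`) of the first alternative also contains the bracketed text, which is why the words are read from the groups.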
Update:
Explanation: So, we've got two options here:
1 \[[^\]]+\](?:^|\s)([\w']+)(?!\])\b
This says:
\[[^\]]+\] - match everything in square brackets (don't capture)
(?:^|\s) - followed by line start or a space (looking at it now, the caret can come out as it doesn't make sense there, so this becomes just \s)
([\w']+) - match all following word characters, as long as
(?!\]) - the next character is not the closing bracket (this is probably also unnecessary now, so let's try removing the lookahead)
\b - and match a word boundary
2 (?:^|\s)([\w']+)(?!\])\b
If option 1 does not match, just do the word matching, without looking for square brackets, since the first part already ensured that they are not here.
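The breakdown above can be sketched by assembling the pattern from named pieces. The constant names are hypothetical and only label the parts described in the explanation. `String.raw` is used so that the backslashes reach the `RegExp` constructor unchanged.

```javascript
// Pieces of the pattern, named after the explanation above (names are hypothetical).
const bracketed = String.raw`\[[^\]]+\]`;      // a whole [ ... ] segment, not captured
const separator = String.raw`(?:^|\s)`;        // line start or one whitespace character
const word = String.raw`([\w']+)`;             // the captured word, apostrophes allowed
const notBeforeClose = String.raw`(?!\])`;     // the next character is not "]"
const boundary = String.raw`\b`;               // word boundary

// Option 1: a bracketed segment followed by a word; option 2: a plain word.
const option1 = bracketed + separator + word + notBeforeClose + boundary;
const option2 = separator + word + notBeforeClose + boundary;
const combined = new RegExp(option1 + "|" + option2, "g");

const matches = "[don't match these words] but do match these".match(combined);
console.log(matches.length); // 4
```

Alternation tries option 1 first at each position, so a bracketed segment followed by a word is consumed as one match before option 2 gets a chance to see the words inside it.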
OK, so I removed everything we don't need (it was left in because I tried quite a few options before it worked :-) ), and the revised regex is the one below:
\[[^\]]+\]\s([\w']+)(?!\])\b|(?:^|\s)([\w']+)\b
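A quick check of the revised regex against the sample sentence from the question; `countOutside` is a hypothetical helper name. As the answer notes, brackets that come after the main text would need an extra alternative. For that case the sketch also includes a different technique, not the one from the answer: remove every bracketed segment first, then count the words that remain.

```javascript
// Revised regex from the answer, with the global flag added for counting.
const revised = /\[[^\]]+\]\s([\w']+)(?!\])\b|(?:^|\s)([\w']+)\b/g;

// Hypothetical helper: counts the matches of the revised regex.
function countOutside(text) {
  return Array.from(text.matchAll(revised)).length;
}

// Alternative technique (not from the answer): drop every [ ... ] segment,
// then count the words that are left.
function countAfterStripping(text) {
  const stripped = text.replace(/\[[^\]]*\]/g, " ");
  const words = stripped.match(/[\w']+/g);
  return words === null ? 0 : words.length;
}

const input = "[don't match these words] but do match these";
console.log(countOutside(input));        // 4
console.log(countAfterStripping(input)); // 4
console.log(countAfterStripping("match these [but not these]")); // 2
```

The stripping version takes two passes over the text, but it does not depend on where the brackets sit relative to the words being counted.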