Word made up of exactly 4 different letters using regular expressions?
Question
For example, reassesses will match. It contains exactly 4 different characters: 'r', 'e', 'a' and 's'.
My attempt is:
/^([a-z])([a-z])([a-z])([a-z])(\1|\2|\3|\4)(\1|\2|\3|\4)(\1|\2|\3|\4)$/
(add as many (\1|\2|\3|\4) groups as required to match the length of the word)
However, this matches only up to 4 different letters, and only if they are the first 4 characters.
Is there a better solution?
Recommended answer
Something like this:
^([a-z])\1*+([a-z])(?:\1|\2)*+([a-z])(?:\1|\2|\3)*+([a-z])(?:\1|\2|\3|\4)*$
The possessive quantifiers are essential in this pattern: they forbid backtracking, so a later capturing group can never match a letter that an earlier group has already captured.
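To make the role of the possessive quantifiers concrete, here is a minimal Java sketch that compares the pattern above with a greedy copy of it. The class name PossessiveDemo, the GREEDY variant and the test word "aaabc" are illustrative assumptions, not part of the original answer.

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    // The answer's pattern, with backslashes doubled for a Java string literal.
    static final Pattern POSSESSIVE = Pattern.compile(
        "^([a-z])\\1*+([a-z])(?:\\1|\\2)*+([a-z])(?:\\1|\\2|\\3)*+([a-z])(?:\\1|\\2|\\3|\\4)*$");

    // Hypothetical comparison pattern: identical, but every *+ is replaced by a plain *.
    static final Pattern GREEDY = Pattern.compile(
        "^([a-z])\\1*([a-z])(?:\\1|\\2)*([a-z])(?:\\1|\\2|\\3)*([a-z])(?:\\1|\\2|\\3|\\4)*$");

    static boolean matchesPossessive(String word) {
        return POSSESSIVE.matcher(word).matches();
    }

    static boolean matchesGreedy(String word) {
        return GREEDY.matcher(word).matches();
    }

    public static void main(String[] args) {
        String threeLetters = "aaabc"; // only 3 distinct letters: a, b, c
        System.out.println(matchesPossessive(threeLetters)); // false
        System.out.println(matchesGreedy(threeLetters));     // true (wrong answer)
        System.out.println(matchesPossessive("reassesses")); // true
    }
}
```

With the greedy copy, the engine backtracks so that \1* consumes only one 'a', which lets group 2 capture the letter 'a' a second time. That is why "aaabc" is wrongly accepted.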
Possessive quantifiers are available in Java (don't forget to double-escape the backreferences). If you need the pattern in a language that lacks this feature, you can find several options to "translate" the pattern in my comment.
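The comment mentioned above is not reproduced here. One possible translation, offered as an assumption rather than as the answer author's own suggestion, keeps plain greedy quantifiers and puts a negative lookahead before each new capturing group, so that the group cannot capture a letter that has already been seen. The sketch below is written in Java only so that it can be checked easily. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class LookaheadVariant {
    // Hypothetical rewrite without possessive quantifiers: each (?!...) rejects
    // any position where the next letter equals an earlier captured letter.
    static final Pattern NO_POSSESSIVE = Pattern.compile(
        "^([a-z])\\1*(?!\\1)([a-z])(?:\\1|\\2)*(?!\\1|\\2)([a-z])"
        + "(?:\\1|\\2|\\3)*(?!\\1|\\2|\\3)([a-z])(?:\\1|\\2|\\3|\\4)*$");

    static boolean hasExactlyFourLetters(String word) {
        return NO_POSSESSIVE.matcher(word).matches();
    }

    public static void main(String[] args) {
        for (String w : new String[] {"reassesses", "aaabc", "abcde", "abcd"}) {
            System.out.println(w + " -> " + hasExactlyFourLetters(w));
        }
    }
}
```

Backtracking is still allowed here, but every shorter repetition leaves an already-seen letter in front of the next group, and the lookahead rejects that position.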
The pattern above is built to check a whole string. If you want to find words inside a larger string, you can use this instead (possibly with the case-insensitive option):
(?<![a-z])([a-z])\1*+([a-z])(?:\1|\2)*+([a-z])(?:\1|\2|\3)*+([a-z])(?:\1|\2|\3|\4)*(?![a-z])
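As a rough illustration of the search variant, the following Java sketch collects every word in a sentence that uses exactly 4 different letters. The class name FourLetterWordFinder, the helper findWords and the sample sentence are assumptions made for this example. Pattern.CASE_INSENSITIVE stands in for the case-insensitive option mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FourLetterWordFinder {
    // The search pattern from the answer, double-escaped for a Java string literal.
    static final Pattern WORD = Pattern.compile(
        "(?<![a-z])([a-z])\\1*+([a-z])(?:\\1|\\2)*+([a-z])(?:\\1|\\2|\\3)*+([a-z])"
        + "(?:\\1|\\2|\\3|\\4)*(?![a-z])",
        Pattern.CASE_INSENSITIVE);

    // Returns every word in the text made up of exactly 4 different letters.
    static List<String> findWords(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [Reassesses, long, word]
        System.out.println(findWords("Reassesses is a long word, unlike tree or noon."));
    }
}
```

The lookbehind and lookahead act as word boundaries restricted to letters, so a match can neither start nor end in the middle of a longer word such as "unlike".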