Word made up of exactly 4 different letters using regular expressions?


Problem description

For example, "reassesses" should match: it contains exactly 4 different characters, 'r', 'e', 'a' and 's'.

My attempt is: /^([a-z])([a-z])([a-z])([a-z])(\1|\2|\3|\4)(\1|\2|\3|\4)(\1|\2|\3|\4)$/ (adding as many (\1|\2|\3|\4) groups as required to match the length of the word)

However, this only matches words with up to 4 different letters, and only if those letters are the first 4 characters.

Is there a better solution?

Recommended answer

Something like this:

^([a-z])\1*+([a-z])(?:\1|\2)*+([a-z])(?:\1|\2|\3)*+([a-z])(?:\1|\2|\3|\4)*$

The possessive quantifiers are essential in this pattern: they forbid backtracking, which prevents the next capturing group from matching a letter that has already been found.
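To illustrate why the possessive quantifiers matter, here is a minimal Java sketch that compares the pattern above with a greedy variant (the same pattern with each `*+` replaced by `*`). The class and method names are hypothetical, chosen only for this example. With greedy quantifiers, the engine can give back a repeated letter so that a later group captures it again, and "aabc" (only 3 different letters) is wrongly accepted.

```java
import java.util.regex.Pattern;

public class PossessiveDemo {
    // Pattern from the answer; each backslash is doubled for a Java string literal.
    static final Pattern POSSESSIVE = Pattern.compile(
        "^([a-z])\\1*+([a-z])(?:\\1|\\2)*+([a-z])(?:\\1|\\2|\\3)*+([a-z])(?:\\1|\\2|\\3|\\4)*$");

    // Same pattern with plain greedy quantifiers, shown for comparison only.
    static final Pattern GREEDY = Pattern.compile(
        "^([a-z])\\1*([a-z])(?:\\1|\\2)*([a-z])(?:\\1|\\2|\\3)*([a-z])(?:\\1|\\2|\\3|\\4)*$");

    static boolean possessive(String s) {
        return POSSESSIVE.matcher(s).matches();
    }

    static boolean greedy(String s) {
        return GREEDY.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(possessive("reassesses")); // true: r, e, a, s
        System.out.println(possessive("aabc"));       // false: only 3 different letters
        System.out.println(greedy("aabc"));           // true: backtracking lets group 2 re-capture 'a'
    }
}
```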

Possessive quantifiers are available in Java (don't forget to double-escape the backreferences), but if you need to use the pattern in a language that doesn't have this feature, you can find several options to "translate" the pattern in my comment.
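The comment mentioned above is not reproduced here, so the following is only one commonly used workaround and not necessarily one of the options the author had in mind. A lookahead is atomic, so capturing the repeated run inside `(?=(...))` and then consuming it with a backreference behaves like a possessive quantifier. The sketch is written in Java purely so it can be checked against the original pattern; the class name is hypothetical, and note that the group numbers shift because of the extra captures.

```java
import java.util.regex.Pattern;

public class EmulatedPossessive {
    // Letters are captured in groups 1, 3, 5 and 7.
    // Groups 2, 4 and 6 hold the runs grabbed inside the atomic lookaheads.
    static final Pattern EMULATED = Pattern.compile(
        "^([a-z])(?=(\\1*))\\2"
        + "([a-z])(?=((?:\\1|\\3)*))\\4"
        + "([a-z])(?=((?:\\1|\\3|\\5)*))\\6"
        + "([a-z])(?:\\1|\\3|\\5|\\7)*$");

    static boolean matches(String s) {
        return EMULATED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("reassesses")); // true
        System.out.println(matches("aabc"));       // false
    }
}
```

The same regex text (with single backslashes) should carry over to engines that support lookaheads and backreferences but lack possessive quantifiers, though that is worth testing in the target language.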

The pattern above is built to check a whole string, but if you want to find words in a larger string, you can use this (optionally with the case-insensitive option):

(?<![a-z])([a-z])\1*+([a-z])(?:\1|\2)*+([a-z])(?:\1|\2|\3)*+([a-z])(?:\1|\2|\3|\4)*(?![a-z])
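As a rough sketch of how the search pattern might be used in Java, the example below collects every word with exactly 4 different letters from a sentence. The class name, method name and sample sentence are made up for illustration. It assumes `Pattern.CASE_INSENSITIVE`, under which Java also compares backreferences case-insensitively.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordFinder {
    // Search pattern from the answer: the lookbehind and lookahead keep a match
    // from starting or ending in the middle of a word.
    static final Pattern WORD = Pattern.compile(
        "(?<![a-z])([a-z])\\1*+([a-z])(?:\\1|\\2)*+([a-z])(?:\\1|\\2|\\3)*+([a-z])(?:\\1|\\2|\\3|\\4)*(?![a-z])",
        Pattern.CASE_INSENSITIVE);

    // Returns every word in the text that is made up of exactly 4 different letters.
    static List<String> find(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // "the", "data" and "noon" have fewer than 4 different letters.
        System.out.println(find("Reassesses the data; noon came.")); // [Reassesses, came]
    }
}
```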
