MySQL regular expression search with no repeating characters
Problem description
I have a database table with words from a dictionary.
Now I want to select words for an anagram. For example, if I give the string SEPIAN, it should fetch values like apes, pain, pains, pies, pines, sepia, etc.
For this I used the query:
SELECT * FROM words WHERE word REGEXP '^[SEPIAN]{1,6}$'
But this query also returns words like anna and essen, which repeat characters more often than they appear in the supplied string. For example, anna has two n's, but there is only one n in the search string SEPIAN.
How can I write my regular expression to achieve this? Also, if my search string itself contains repeated characters, the results should be allowed to contain those repeats as well.
Recommended answer
Since MySQL does not support back-references to capturing groups, the typical solution of (\w).*\1 will not work. This means that any solution will need to enumerate every letter that could be repeated. Furthermore, as far as I can tell, back-references are not valid inside look-aheads or look-behinds, and MySQL does not support look-aheads or look-behinds anyway.
However, you can split this into two expressions and use the following query:
SELECT * FROM words
WHERE word REGEXP '^[SEPIAN]{1,6}$'
AND NOT word REGEXP 'S.*?S|E.*?E|P.*?P|I.*?I|A.*?A|N.*?N'
Not very pretty, but it works, and it should be fairly efficient as well.
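As a quick sanity check of the two-expression idea, the same patterns can be tried outside the database. The sketch below uses Python's `re` module as a stand-in for MySQL's regex engine, which is an assumption: the two engines are not identical, and `re.IGNORECASE` is used here to imitate the case-insensitive matching that `REGEXP` typically applies to non-binary strings. The helper name `is_candidate` and the sample word list are made up for illustration.

```python
import re

# The same two patterns as in the SQL query, compiled for Python's engine.
ALLOWED = re.compile(r"^[SEPIAN]{1,6}$", re.IGNORECASE)
REPEATED = re.compile(r"S.*?S|E.*?E|P.*?P|I.*?I|A.*?A|N.*?N", re.IGNORECASE)


def is_candidate(word):
    """True if word uses only letters from SEPIAN and repeats none of them."""
    return bool(ALLOWED.search(word)) and not REPEATED.search(word)


# Hypothetical sample: the words mentioned in the question.
sample_words = ["apes", "pain", "pains", "pies", "pines", "sepia", "anna", "essen"]
matches = [w for w in sample_words if is_candidate(w)]
print(matches)
```

With this sample, anna and essen pass the first pattern but are caught by the second one, so only the six intended words remain.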
To allow a character to repeat up to a set limit, use the following pattern for that character in the secondary expression:
A(.*?A){X,}
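where A is your character and X is the number of times it is allowed to appear. The pattern matches X + 1 or more occurrences of A, so the NOT in the query rejects exactly the words that exceed the limit.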
So if you add another N to your string, giving SEPIANN (for a total of two Ns), your query becomes:
SELECT * FROM words
WHERE word REGEXP '^[SEPIAN]{1,7}$'
AND NOT word REGEXP 'S.*?S|E.*?E|P.*?P|I.*?I|A.*?A|N(.*?N){2}'
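Writing these patterns by hand gets tedious once the letters change, so one option is to generate both of them from the search string. The following is a minimal sketch, not part of the original answer: the function name `build_anagram_patterns` is hypothetical, it assumes the input contains only plain letters (nothing is escaped), and it writes every letter in the `A(.*?A){X,}` form, so a letter allowed once becomes `S(.*?S){1,}` rather than the shorter `S.*?S`. The two forms accept the same words.

```python
from collections import Counter


def build_anagram_patterns(letters):
    """Return (allowed, reject) regex strings for a string of plain letters."""
    counts = Counter(letters.upper())  # keeps first-seen order of the letters
    # First pattern: only letters from the input, at most len(letters) of them.
    allowed = "^[{}]{{1,{}}}$".format("".join(counts), len(letters))
    # Second pattern: a letter allowed X times is rejected when one occurrence
    # is followed by X or more further occurrences (X + 1 in total).
    reject = "|".join(
        "{0}(.*?{0}){{{1},}}".format(ch, n) for ch, n in counts.items()
    )
    return allowed, reject


allowed, reject = build_anagram_patterns("SEPIANN")
print(allowed)
print(reject)
```

The two strings would then be passed as the first and second REGEXP arguments of the query, preferably as bound parameters rather than by string concatenation.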