MySQL regular expression search with no repeating characters


Question

I have a database table with words from a dictionary.

Now I want to select words for an anagram. For example, if I give the string SEPIAN, it should fetch values like apes, pain, pains, pies, pines, sepia, etc.

For this I used the query:

SELECT * FROM words WHERE word REGEXP '^[SEPIAN]{1,6}$'

But this query also returns words like anna and essen, which repeat a character more often than it appears in the supplied string. E.g., anna has two n's, but there is only one n in the search string SEPIAN.

How can I write my regular expression to achieve this? Also, if the search string itself contains repeated characters, the results should be allowed to repeat those characters the same number of times.

Recommended answer

MySQL's traditional regex engine (the one used before MySQL 8.0.4, which switched to the ICU library) does not support back-references to capturing groups, so the typical solution of (\w).*\1 will not work. This means that any solution needs to enumerate every character that could be doubled. Look-aheads and look-behinds are not supported by that engine either.

However, you can split this into two expressions, and use the following query:

SELECT * FROM words
WHERE word REGEXP '^[SEPIAN]{1,6}$'
AND NOT word REGEXP 'S.*S|E.*E|P.*P|I.*I|A.*A|N.*N'

Not very pretty, but it works and it should be fairly efficient as well.
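
As a quick sanity check of the logic, here is a minimal Python sketch that mirrors the two conditions of the query above on the sample words from the question. It uses Python's re module rather than MySQL's regex engine, and the helper name is_candidate is made up for this illustration, so treat it as a model of the idea rather than a test of MySQL itself.

```python
import re

# First condition: the word consists only of the supplied letters.
# re.IGNORECASE mimics MySQL's case-insensitive REGEXP on non-binary strings.
allowed = re.compile(r'^[SEPIAN]{1,6}$', re.IGNORECASE)

# Second condition (negated in the query): some letter occurs twice.
doubled = re.compile(r'S.*S|E.*E|P.*P|I.*I|A.*A|N.*N', re.IGNORECASE)

def is_candidate(word):
    """True if word uses only the given letters, each at most once."""
    return bool(allowed.search(word)) and not doubled.search(word)

words = ['apes', 'pain', 'pains', 'pies', 'pines', 'sepia', 'anna', 'essen']
kept = [w for w in words if is_candidate(w)]
print(kept)
```

Here anna and essen pass the first pattern but are rejected by the second, which is exactly the split the query relies on.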

To support a set limit of repeated characters, use the following pattern for your secondary expression:

A(.*A){X,}

Where A is your character and X is the number of times it's allowed.

So if you add another N to your string, giving SEPIANN (for a total of two Ns), your query becomes:

SELECT * FROM words
WHERE word REGEXP '^[SEPIAN]{1,7}$'
AND NOT word REGEXP 'S.*S|E.*E|P.*P|I.*I|A.*A|N(.*N){2,}'
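
Writing these patterns by hand gets tedious for arbitrary letter sets, so one option is to generate them. The following Python sketch is one possible way to do that, not part of the original answer: the function names build_patterns, build_query and fits are hypothetical, the table and column names are the ones from the question, and the input is assumed to be plain A-Z letters (anything else is rejected, since it is interpolated into a regex and an SQL string).

```python
import re
from collections import Counter

def build_patterns(letters):
    """Return (allowed, too_many) regex strings for the given letters."""
    if not re.fullmatch(r'[A-Za-z]+', letters):
        raise ValueError('letters must be A-Z only')
    counts = Counter(letters.upper())  # keeps first-seen order
    # Only the supplied letters, at most len(letters) characters in total.
    allowed = '^[%s]{1,%d}$' % (''.join(counts), len(letters))
    # A letter allowed n times is rejected once it appears n + 1 times:
    # one occurrence followed by n or more further occurrences.
    too_many = '|'.join('%s(.*%s){%d,}' % (c, c, n)
                        for c, n in counts.items())
    return allowed, too_many

def build_query(letters, table='words', column='word'):
    """Assemble the two-condition query as a string (names are assumptions)."""
    allowed, too_many = build_patterns(letters)
    return ("SELECT * FROM %s WHERE %s REGEXP '%s' AND NOT %s REGEXP '%s'"
            % (table, column, allowed, column, too_many))

def fits(word, letters):
    """Apply the same two conditions locally with Python's re module."""
    allowed, too_many = build_patterns(letters)
    return (re.search(allowed, word, re.IGNORECASE) is not None
            and re.search(too_many, word, re.IGNORECASE) is None)

print(build_query('SEPIANN'))
```

For SEPIANN the generated second pattern ends in N(.*N){2,}, and single letters come out as S(.*S){1,}, which is just a longer spelling of S.*S.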
