MySQL选择匹配字符串的最大长度 [英] MySQL select maximum length of matching string

查看:455
本文介绍了MySQL选择匹配字符串的最大长度的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要返回所有文本结果,如果有的话,它们共享搜索字符串共有的最大长度左边界子字符串.

I need to return all the text results(s), if any, that share the maximum length left bounded substring common to the search string.

在包含

"Stack",
"Sta", 
"StackOv", 
"StackOverthrow",
"StackOverSlow",
"StackFlow", 
"Soverflow",
"StackOverCrow",
"StackOverSlow",
etc. 

查询将返回"StackOverthrow",因为它包含最大数量的匹配字符,以及唯一结果集中的StackOverSlow和StackOverCrow. 目前,我的工作效率低下,这是从对第一个字符的搜索开始,然后继续重复和扩展搜索字符串,直到找不到任何内容为止,并保持最后的良好结果.

the query would return "StackOverthrow" as it contains the greatest number of matching characters, as well as StackOverSlow and StackOverCrow in a unique result set. Currently Im doing something inefficient which is to start with a LIKE search for the first characters and to continue repeating and extending the search string until nothing is found, and keeping the last good result.

select names from table where name like 'XX%';


 "S" ->Results
 "St"->Results
 . .
 "StackOver"->Results 
 "StackOverf"-> No results (Last result returning items beginning with StackOver etc  as being the correct answer)

我知道这种方法效率极低,任何人都可以提供一个查询来实现此结果吗?我知道我可以一次搜索所有组合,然后过滤代码中最长的结果,但是,我认为数据库应该会更好.

I know that this approach is extremely inefficient, can anyone provide a single query to achieve this result? I know I could search for all combinations at once and filter for the longest results in code, however, I think the DB should be better at this.

Edit1:请注意,上面的示例有些简化.数据库中的绝大多数数据在2到10个字符之间,最常见的匹配长度约为3个字符.该表中有多达10万条记录.

Note the example above is somewhat of a simplification. The vast majority of data in the DB is between 2 and 10 chars, with the most common match length of about 3 chars. There are upwards of 100K records in the table.

Edit2:抱歉,我需要澄清,可能有多个正确的结果,并且结果可能包含需要删除的重复项.目前,由于我效率低下的方法,选择独特"很容易.

Apologies, I needed to clarify that there may be more than one correct result, and that the results can contain duplicates that need to be removed. Currently with my inefficient method selecting distinct is easy.

推荐答案

name上具有索引时,以下内容应表现出色:

With an index on name, the following ought to be extremely performant:

SELECT DISTINCT name
FROM   myTable
WHERE  name LIKE CASE
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'S%') THEN '%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'St%') THEN 'S%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'Sta%') THEN 'St%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'Stac%') THEN 'Sta%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'Stack%') THEN 'Stac%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackO%') THEN 'Stack%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOv%') THEN 'StackO%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOve%') THEN 'StackOv%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOver%') THEN 'StackOve%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOverf%') THEN 'StackOver%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOverfl%') THEN 'StackOverf%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOverflo%') THEN 'StackOverfl%'
  WHEN NOT EXISTS(SELECT * FROM myTable WHERE name LIKE 'StackOverflow%') THEN 'StackOverflo%'
  ELSE 'StackOverflow%'
END

sqlfiddle 上查看.

这篇关于MySQL选择匹配字符串的最大长度的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆