How to query text to find the longest prefix strings in SQL?


Question

I am using Spark SQL. Let's say this is a snapshot of my big table:

ups store
ups store austin
ups store chicago
ups store bern
walmart
target

How can I find the longest prefixes for the above data in SQL? That is:

 ups store
 walmart
 target

I already have a Java program that does this, but my file is large. Can this reasonably be done in SQL?

How about the following more complicated scenario? (I can live without this, but it would be nice to have if possible.)

ups store austin
ups store chicago
ups store bern
walmart
target

which would return [ups store, walmart, target].

Recommended answer

Assuming you're free to create another table that simply holds a list of ascending integers from zero up to the size of the longest possible string, the following should do the job using only ANSI SQL. The query below reads the data from a table places (id, name) and the integers from a table lengths (number):

SELECT
  id,
  SUBSTRING(name, 1, CASE WHEN number = 0 THEN LENGTH(name) ELSE number END) AS prefix
FROM
 -- Join all places to all possible substring lengths.
 (SELECT *
  FROM places p
  CROSS JOIN lengths l) subq
-- If number is zero then no prefix match was found elsewhere
-- (from the question it looked like you wanted to include these)
WHERE (subq.number = 0 OR
       -- Look for prefix match elsewhere
       EXISTS (SELECT * FROM places p
               WHERE SUBSTRING(p.name FROM 1 FOR subq.number)
                     = SUBSTRING(subq.name FROM 1 FOR subq.number)
                 AND p.id <> subq.id))
  -- Include as a prefix match if the whole string is being used
  AND (subq.number = LENGTH(name)
       -- Don't include trailing spaces in a prefix
       OR (SUBSTRING(subq.name, subq.number, 1) <> ' '
           -- Only include the longest prefix match 
           AND NOT EXISTS (SELECT * FROM places p 
                           WHERE SUBSTRING(p.name FROM 1 FOR subq.number + 1)
                                 = SUBSTRING(subq.name FROM 1 FOR subq.number + 1)
                             AND p.id <> subq.id)))
ORDER BY id;

Live demo: http://rextester.com/XPNRP24390
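For anyone who wants to try the query locally, here is a minimal sketch that builds the places and lengths tables in an in-memory SQLite database through Python's sqlite3 module and runs the same logic against the first data set from the question. SQLite is an assumption made only for this illustration (the question is about Spark SQL), and SUBSTRING is written as SQLite's substr; other engines may compare strings with trailing spaces differently, so results there may vary.

```python
import sqlite3

# In-memory SQLite database: an assumption for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE lengths (number INTEGER PRIMARY KEY)")

names = [
    "ups store",
    "ups store austin",
    "ups store chicago",
    "ups store bern",
    "walmart",
    "target",
]
conn.executemany(
    "INSERT INTO places (id, name) VALUES (?, ?)",
    list(enumerate(names, start=1)),
)

# Helper table: ascending integers from 0 up to the longest name length.
max_len = max(len(n) for n in names)
conn.executemany(
    "INSERT INTO lengths (number) VALUES (?)",
    [(n,) for n in range(max_len + 1)],
)

# Same structure as the answer's query, with SUBSTRING spelled as substr.
QUERY = """
SELECT
  id,
  substr(name, 1, CASE WHEN number = 0 THEN length(name) ELSE number END) AS prefix
FROM
  (SELECT * FROM places p CROSS JOIN lengths l) subq
WHERE (subq.number = 0 OR
       EXISTS (SELECT * FROM places p
               WHERE substr(p.name, 1, subq.number)
                     = substr(subq.name, 1, subq.number)
                 AND p.id <> subq.id))
  AND (subq.number = length(subq.name)
       OR (substr(subq.name, subq.number, 1) <> ' '
           AND NOT EXISTS (SELECT * FROM places p
                           WHERE substr(p.name, 1, subq.number + 1)
                                 = substr(subq.name, 1, subq.number + 1)
                             AND p.id <> subq.id)))
ORDER BY id
"""

rows = conn.execute(QUERY).fetchall()
for place_id, prefix in rows:
    print(place_id, prefix)
```

The cross join produces one row per (place, candidate length) pair, so the work grows with the number of places times the longest name length, and each row runs correlated subqueries over places again. On a large table that cost is worth measuring before relying on this approach.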

The second aspect: what if we have (ups store austin, ups store chicago)? Can we use SQL to extract the 'ups store' off of it?

This should simply be a case of using SUBSTRING in a similar way to the above, e.g.:

SELECT SUBSTRING(name,
                 LENGTH('ups store ') + 1,
                 LENGTH(name) - LENGTH('ups store '))
FROM places
WHERE SUBSTRING(name,
                1,
                LENGTH('ups store ')) = 'ups store ';
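The same prefix-stripping query can be tried out with a small sketch like the one below. It assumes an in-memory SQLite database via Python's sqlite3 module (not the asker's Spark SQL setup), uses SQLite's substr in place of SUBSTRING, and passes the known prefix as a named parameter called prefix, which is a name chosen for this example.

```python
import sqlite3

# In-memory SQLite database: an assumption for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT)")

names = [
    "ups store",
    "ups store austin",
    "ups store chicago",
    "ups store bern",
    "walmart",
    "target",
]
conn.executemany(
    "INSERT INTO places (id, name) VALUES (?, ?)",
    list(enumerate(names, start=1)),
)

# Known prefix, including the trailing space that separates it from the rest.
prefix = "ups store "

# Keep only rows that start with the prefix and return what follows it.
STRIP_QUERY = """
SELECT substr(name, length(:prefix) + 1, length(name) - length(:prefix))
FROM places
WHERE substr(name, 1, length(:prefix)) = :prefix
ORDER BY id
"""

suffixes = [row[0] for row in conn.execute(STRIP_QUERY, {"prefix": prefix})]
print(suffixes)
```

In SQLite the bare 'ups store' row is filtered out because it lacks the trailing space; engines that pad strings before comparing may treat that row differently.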
