How to query text to find the longest prefix strings in SQL?

Question

I am using Spark SQL. Let's say this is a snapshot of my big table:

ups store
ups store austin
ups store chicago
ups store bern
walmart
target

How can I find the longest prefixes for the above data in SQL? That is:

 ups store
 walmart
 target

I already have a Java program that does this, but the file is large. Can this reasonably be done in SQL?

How about the following more complicated scenario? (I can live without this, but it would be nice to have if possible.)

ups store austin
ups store chicago
ups store bern
walmart
target

That would return [ups store, walmart, target].

Solution

Assuming you're free to create another table (lengths, with a single integer column number, as used in the query below) that simply holds ascending integers from zero up to the length of the longest possible string, the following should do the job using only ANSI SQL:

SELECT
  id,
  SUBSTRING(name, 1, CASE WHEN number = 0 THEN LENGTH(name) ELSE number END) AS prefix
FROM
 -- Join all places to all possible substring lengths.
 (SELECT *
  FROM places p
  CROSS JOIN lengths l) subq
-- If number is zero then no prefix match was found elsewhere
-- (from the question it looked like you wanted to include these)
WHERE (subq.number = 0 OR
       -- Look for prefix match elsewhere
       EXISTS (SELECT * FROM places p
               WHERE SUBSTRING(p.name FROM 1 FOR subq.number)
                     = SUBSTRING(subq.name FROM 1 FOR subq.number)
                 AND p.id <> subq.id))
  -- Include as a prefix match if the whole string is being used
  AND (subq.number = LENGTH(name)
       -- Don't include trailing spaces in a prefix
       OR (SUBSTRING(subq.name, subq.number, 1) <> ' '
           -- Only include the longest prefix match 
           AND NOT EXISTS (SELECT * FROM places p 
                           WHERE SUBSTRING(p.name FROM 1 FOR subq.number + 1)
                                 = SUBSTRING(subq.name FROM 1 FOR subq.number + 1)
                             AND p.id <> subq.id)))
ORDER BY id;

Live demo: http://rextester.com/XPNRP24390
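For anyone who wants to try the query locally, here is a minimal sketch that runs it against the sample rows from the question. It makes several assumptions that are not part of the answer: it uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for Spark SQL, it ports SUBSTRING(x FROM a FOR b) to SQLite's substr(x, a, b), and it fills the lengths table from a Python range. String comparison rules (for example, trailing-space handling) differ between engines, so results elsewhere may not match.

```python
import sqlite3

# Sample rows taken from the question; ids are assigned here for illustration.
names = ["ups store", "ups store austin", "ups store chicago",
         "ups store bern", "walmart", "target"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO places (id, name) VALUES (?, ?)",
                 list(enumerate(names, start=1)))

# Helper table: integers from 0 up to the length of the longest name.
conn.execute("CREATE TABLE lengths (number INTEGER PRIMARY KEY)")
max_len = max(len(n) for n in names)
conn.executemany("INSERT INTO lengths (number) VALUES (?)",
                 [(n,) for n in range(max_len + 1)])

# The answer's query, ported to SQLite's substr(); the logic is unchanged.
QUERY = """
SELECT
  id,
  substr(name, 1, CASE WHEN number = 0 THEN length(name) ELSE number END) AS prefix
FROM
  (SELECT * FROM places p CROSS JOIN lengths l) subq
WHERE (subq.number = 0 OR
       EXISTS (SELECT * FROM places p
               WHERE substr(p.name, 1, subq.number)
                     = substr(subq.name, 1, subq.number)
                 AND p.id <> subq.id))
  AND (subq.number = length(subq.name)
       OR (substr(subq.name, subq.number, 1) <> ' '
           AND NOT EXISTS (SELECT * FROM places p
                           WHERE substr(p.name, 1, subq.number + 1)
                                 = substr(subq.name, 1, subq.number + 1)
                             AND p.id <> subq.id)))
ORDER BY id;
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 'ups store'), (5, 'walmart'), (6, 'target')]
```

Note that the cross join produces one row per place per candidate length, and each of those rows runs up to two correlated subqueries, so on a large table this is likely to be expensive without further tuning.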

The second aspect is: what if we have (ups store austin, ups store chicago)? Can we use SQL to extract the 'ups store' from it?

This should simply be a case of using SUBSTRING in a similar way to the above. For example, the following matches rows that start with a known prefix and returns what follows it:

SELECT SUBSTRING(name,
                 LENGTH('ups store ') + 1,
                 LENGTH(name) - LENGTH('ups store '))
FROM places
WHERE SUBSTRING(name,
                1,
                LENGTH('ups store ')) = 'ups store ';
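The prefix-stripping query can be checked the same way. This is a sketch under the same assumptions as before (SQLite through Python's sqlite3 as a stand-in engine, substr() in place of SUBSTRING), with an ORDER BY added only to make the output order deterministic; the three sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO places (id, name) VALUES (?, ?)",
    [(1, "ups store austin"), (2, "ups store chicago"), (3, "walmart")],
)

# Keep only rows that begin with the known prefix (including its trailing
# space), then return the remainder of each name after that prefix.
QUERY = """
SELECT substr(name,
              length('ups store ') + 1,
              length(name) - length('ups store '))
FROM places
WHERE substr(name, 1, length('ups store ')) = 'ups store '
ORDER BY id;
"""

suffixes = [row[0] for row in conn.execute(QUERY)]
print(suffixes)  # ['austin', 'chicago']
```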
