MySQL get string(s) between two # / multiple pairs of #


Question

How can I find the string between two # characters, or between multiple pairs of #?

An example text to search: This is #important# and needs to elaborated further. Remember to buy #milk before coming home#.

I want the result to be:

important

milk before coming home

Recommended answer

Edit1

create table t91
(   id int auto_increment primary key,
    thing varchar(1000) not null
);
insert t91(thing) values
('This is #important# and needs to elaborated further. Remember to buy #milk before coming home#'),
('This is #important# and needs to elaborated further. Remember to buy #milk home#'),
('This is #');


select id,thing,theCount
from
(   SELECT id,thing,
        ROUND (   
            (
                LENGTH(thing)
                - LENGTH( REPLACE ( thing, "#", "") ) 
            ) / LENGTH("#")        
        ) AS theCount    
    FROM t91
) d
where d.theCount>1;
-- 2 rows returned (id 1 and 2)
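
The counting expression in the derived table can be tried on a literal before running it against t91. This is a minimal sketch: the user variable @s and its sample text are made up for illustration, and the marker is assumed to be the single character '#'.

```sql
-- The string gets shorter by LENGTH('#') bytes for every '#' that REPLACE
-- removes, so the difference divided by LENGTH('#') is the marker count.
SET @s = 'This is #important# and #milk';

SELECT ROUND(
           (LENGTH(@s) - LENGTH(REPLACE(@s, '#', ''))) / LENGTH('#')
       ) AS theCount;
-- @s contains three '#' characters, so theCount is 3

-- Only complete pairs can enclose text, hence the floor(theCount / 2)
-- used by the later queries: 3 markers give 1 usable pair.
SELECT FLOOR(3 / 2) AS segments;
```

An odd count means one marker is unmatched. The filter theCount>1 only drops rows that cannot contain even one pair.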

The edit above uses a strategy from this Answer.

Edit2

drop table if exists t91; -- start fresh (Edit1 already created t91)
create table t91
(   id int auto_increment primary key,
    thing varchar(1000) not null
);
insert t91(thing) values
('This is #important# and needs to elaborated further. Remember to buy #milk before coming home#'),
('This is #important# and needs to elaborated further. Remember to buy #milk home#'),
('This is #'),
('This is #1# ... #2# ... #');

Function:

DROP FUNCTION IF EXISTS findInsideHashMarks;
DELIMITER $$
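CREATE FUNCTION findInsideHashMarks(s VARCHAR(1000),segments INT)
RETURNS VARCHAR(1000)
DETERMINISTIC -- needed when binary logging is on (the MySQL 8.0 default)
BEGIN
    DECLARE i,nextPos,i1,i2 INT;
    DECLARE sOut VARCHAR(1000); -- sized to match t91.thing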
    SET i=0;
    SET nextPos=1;
    SET sOut='';
    WHILE i<segments DO
        -- LOCATE(substr,str,pos)
        SET i1=LOCATE('#',s,nextPos);
        IF i1>0 THEN
            SET i1=i1+1;
            SET nextPos=i1; -- look for the closing '#' right after the opening one, so an empty pair '##' still closes
            SET i2=LOCATE('#',s,nextPos);
            IF i2>0 THEN
                SET nextPos=i2+1;
                SET i2=i2-1;
                SET sOut=CONCAT(sOut,SUBSTRING(s,i1,i2-i1+1));
            END IF;
        END IF;
        SET i=i+1;
        IF i<segments THEN
            SET sOut=CONCAT(sOut,',');
        END IF;
    END WHILE;
    RETURN sOut;
END$$
DELIMITER ;
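
The function can be called on literals to see what it hands to the query. This sketch assumes findInsideHashMarks has already been created as above. The sample strings are illustrative, apart from the first, which is row 4 of t91.

```sql
-- Two complete pairs are requested; the trailing unmatched '#' is ignored.
SELECT findInsideHashMarks('This is #1# ... #2# ... #', 2) AS segString;
-- returns '1,2'

-- The segments are joined with commas, so a comma inside a segment is
-- indistinguishable from a separator once the query splits the result.
SELECT findInsideHashMarks('buy #milk, eggs# today', 1) AS segString;
-- returns 'milk, eggs', which the query would split into two rows
```

If the text between markers can contain commas, one option is to have the function join segments with a separator that cannot occur in the data and to split on that same separator in the query.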

Query:

SELECT id,
SUBSTRING_INDEX(SUBSTRING_INDEX(segString, ',', n.digit+1), ',', -1) dummy
FROM
(   select id,thing,theCount,cast(floor(theCount / 2) as unsigned) as segments,
    findInsideHashMarks(thing,cast(floor(theCount / 2) as unsigned)) as segString
    FROM
    (   SELECT id,thing,
            ROUND (   
                (   LENGTH(thing)
                    - LENGTH( REPLACE ( thing, "#", "") ) 
                ) / LENGTH("#")        
            ) AS theCount    
        FROM t91
    ) d1
    where d1.theCount>1
) d2
INNER JOIN
(SELECT 0 digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3) n
ON LENGTH(REPLACE(d2.segString, ',' , '')) <= LENGTH(d2.segString)-n.digit
ORDER BY d2.id,n.digit;

Output:

+----+-------------------------+
| id | dummy                   |
+----+-------------------------+
|  1 | important               |
|  1 | milk before coming home |
|  2 | important               |
|  2 | milk home               |
|  4 | 1                       |
|  4 | 2                       |
+----+-------------------------+

The breakout of a delimited set (a,b,c) into rows was inspired (understatement) by this Answer from user fthiella. As that Answer notes (though not expressly), the UNION strategy could be expanded to pick up, say, 10 or more chunks of data between # markers.
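
The breakout can be seen in isolation on a literal list. This is a small sketch: the derived table t and its column csv are names made up for the illustration, and the digits table is widened to five entries to show that the join condition discards the surplus ones.

```sql
-- The inner SUBSTRING_INDEX keeps everything up to the (digit+1)-th comma;
-- the outer one, with -1, keeps only the last piece of that prefix.
SELECT n.digit,
       SUBSTRING_INDEX(SUBSTRING_INDEX(t.csv, ',', n.digit + 1), ',', -1) AS piece
FROM (SELECT 'a,b,c' AS csv) t
INNER JOIN
(SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2
 UNION ALL SELECT 3 UNION ALL SELECT 4) n
-- keep a digit only if the list has at least that many commas
ON LENGTH(REPLACE(t.csv, ',', '')) <= LENGTH(t.csv) - n.digit
ORDER BY n.digit;
-- digit 0 gives 'a', digit 1 gives 'b', digit 2 gives 'c'; digits 3 and 4 are filtered out
```

Without the ON condition, digits 3 and 4 would each repeat 'c', because SUBSTRING_INDEX returns the whole string when asked for more delimiters than it contains.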

Also note the prior attribution at the bottom of Edit1.

Edit3

An offshoot of Edit2's function and query, but using a permanent helper table to avoid the UNION in the query for derived table n. It supports up to 100 # segments. It uses the prior (Edit2) t91 table and function.

Schema:

CREATE TABLE 4kTable
(   -- a helper table of about 4k consecutive ints
    id int auto_increment primary key,
    thing int null
)engine=MyISAM;

insert 4kTable (thing) values (null),(null),(null),(null),(null),(null),(null),(null),(null);
insert 4kTable (thing) select thing from 4kTable;
insert 4kTable (thing) select thing from 4kTable;
insert 4kTable (thing) select thing from 4kTable;
insert 4kTable (thing) select thing from 4kTable;
insert 4kTable (thing) select thing from 4kTable;
insert 4kTable (thing) select thing from 4kTable;
insert 4kTable (thing) select thing from 4kTable;
insert 4kTable (thing) select thing from 4kTable;
insert 4kTable (thing) select thing from 4kTable;
-- verify:
-- select min(id),max(id),count(*) from 4kTable;
-- 1 4608 4608

ALTER TABLE 4kTable ENGINE = InnoDB; -- *********** it is now InnoDB

-- verify:
-- select min(id),max(id),count(*) from 4kTable;
-- 1 4608 4608
-- no innodb auto_increment gaps (consecutive block)

Query:

SELECT id, 
SUBSTRING_INDEX(SUBSTRING_INDEX(segString, ',', n.digit+1), ',', -1) dummy 
FROM 
(   select d1.id,thing,theCount,cast(floor(theCount / 2) as unsigned) as segments, 
    findInsideHashMarks(thing,cast(floor(theCount / 2) as unsigned)) as segString 
    FROM 
    (   SELECT id,thing, 
            ROUND (   
                (   LENGTH(thing) 
                    - LENGTH( REPLACE ( thing, "#", "") )  
                ) / LENGTH("#")         
            ) AS theCount     
        FROM t91 
    ) d1 
    where d1.theCount>1 
) d2 
INNER JOIN
(select id-1 as digit from 4kTable where id<=100) n
ON LENGTH(REPLACE(d2.segString, ',' , '')) <= LENGTH(d2.segString)-n.digit
ORDER BY d2.id,n.digit;

The output is the same as in the Edit2 Output section above. I usually keep a few permanent helper tables in a system, especially for left joins with dates.
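
If the server is MySQL 8.0 or later (an assumption, since the answer does not state a version), a recursive common table expression is another way to produce the digits 0 through 99 without a permanent table. This is a different technique from the helper table above, sketched here only as an alternative.

```sql
-- Generate digits 0..99 on the fly; 100 rows is well below the default
-- recursion limit (cte_max_recursion_depth = 1000).
WITH RECURSIVE n (digit) AS (
    SELECT 0
    UNION ALL
    SELECT digit + 1 FROM n WHERE digit < 99
)
SELECT MIN(digit), MAX(digit), COUNT(*) FROM n;
-- 0 99 100
```

In the query above, the same WITH clause could precede the outer SELECT, with n then joined directly in place of the derived table built from 4kTable.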
