Regex pattern inside SQL Replace function?

Question

SELECT REPLACE('<strong>100</strong><b>.00 GB', '%^(^-?\d*\.{0,1}\d+$)%', '');

I want to use the regex above to strip any markup that sits between the two parts of the number, but it does not seem to work. I'm not sure whether the regex syntax is wrong, because I also tried a simpler pattern, '%[^0-9]%', just as a test, and that did not work either. Does anyone know how I can achieve this?

Recommended answer

REPLACE matches its search string literally, so it cannot take a pattern. Instead, you can use PATINDEX to find the index of the first occurrence of a pattern in the string, then use STUFF to overwrite the matched character with another string.
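
As a minimal sketch of the PATINDEX plus STUFF idea on a single value, the loop below cleans the literal string from the question. The variable names @s and @pos are illustrative assumptions, not part of the original answer.

```sql
-- @s and @pos are hypothetical names used only for this illustration.
DECLARE @s varchar(100) = '<strong>100</strong><b>.00 GB';

-- Position of the first character that is neither a digit nor a dot (0 if none).
DECLARE @pos int = PATINDEX('%[^0-9.]%', @s);

WHILE @pos > 0
BEGIN
    -- Delete the one offending character at @pos.
    SET @s = STUFF(@s, @pos, 1, '');
    SET @pos = PATINDEX('%[^0-9.]%', @s);
END;

SELECT @s AS cleaned;  -- cleaned: 100.00
```

Each pass removes exactly one character, so the loop runs once per illegal character in the value.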

Loop through each row and replace each illegal character with whatever you want; in your case, replace every non-numeric character with an empty string. The inner loop handles cells that contain more than one illegal character.

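DECLARE @counter int
DECLARE @RetVal varchar(50)   -- should be at least as wide as Column

SET @counter = 0

-- Outer loop: visit every ID from 0 up to and including the largest one
WHILE(@counter <= (SELECT MAX(ID_COLUMN) FROM Table))
BEGIN

    -- Inner loop: remove one illegal character per pass
    WHILE 1 = 1
    BEGIN
        -- STUFF returns NULL when PATINDEX finds nothing (start position 0),
        -- and the subquery returns NULL when no row has this ID
        SET @RetVal = (SELECT STUFF(Column, PATINDEX('%[^0-9.]%', Column), 1, '')
        FROM Table
        WHERE ID_COLUMN = @counter)

        IF(@RetVal IS NOT NULL)
          UPDATE Table SET
          Column = @RetVal
          WHERE ID_COLUMN = @counter
        ELSE
            BREAK
    END

    SET @counter = @counter + 1
END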

Caution: this is slow! Having a varchar column may affect performance, so using LTRIM and RTRIM may help a bit. Regardless, it is slow.

Credit goes to this Stack Overflow answer.

EDIT: Credit also goes to @srutzky.

Edit (by @Tmdean): Instead of processing one row at a time, this answer can be adapted into a more set-based solution. It still iterates as many times as the largest number of non-numeric characters in any single row, so it is not ideal, but I think it should be acceptable in most situations.

WHILE 1 = 1 BEGIN
    WITH q AS
        (SELECT ID_Column, PATINDEX('%[^0-9.]%', Column) AS n
        FROM Table)
    UPDATE Table
    SET Column = STUFF(Column, q.n, 1, '')
    FROM q
    WHERE Table.ID_Column = q.ID_Column AND q.n != 0;

    IF @@ROWCOUNT = 0 BREAK;
END;

You can also improve efficiency quite a lot if you maintain a bit column in the table that indicates whether the field has been scrubbed yet. (NULL represents "Unknown" in my example and should be the column default.)

DECLARE @done bit = 0;
WHILE @done = 0 BEGIN
    WITH q AS
        (SELECT ID_Column, PATINDEX('%[^0-9.]%', Column) AS n
        FROM Table
        WHERE COALESCE(Scrubbed_Column, 0) = 0)
    UPDATE Table
    SET Column = STUFF(Column, q.n, 1, ''),
        Scrubbed_Column = 0
    FROM q
    WHERE Table.ID_Column = q.ID_Column AND q.n != 0;

    IF @@ROWCOUNT = 0 SET @done = 1;

    -- if Scrubbed_Column is still NULL, then the PATINDEX
    -- must have given 0
    UPDATE Table
    SET Scrubbed_Column = CASE
        WHEN Scrubbed_Column IS NULL THEN 1
        ELSE NULLIF(Scrubbed_Column, 0)
    END;
END;
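
The loop above assumes that Scrubbed_Column already exists as a nullable bit column. A minimal sketch of adding it, assuming a hypothetical table name dbo.MyTable in place of the Table placeholder:

```sql
-- dbo.MyTable is a hypothetical name standing in for the answer's Table placeholder.
-- A nullable column added without a DEFAULT is NULL ("unknown") on every existing row.
ALTER TABLE dbo.MyTable
    ADD Scrubbed_Column bit NULL;
```

Rows inserted later without an explicit value also get NULL, so they are picked up the next time the scrubbing loop runs.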

If you don't want to change your schema, this is easy to adapt so that intermediate results are stored in a table-valued variable, which is applied to the actual table at the end.
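
One way that adaptation might look is sketched below. It is not the answer's own code: dbo.MyTable, MyColumn, @work, and Val are hypothetical names, and the varchar(8000) width is an assumption that should be adjusted to the real column.

```sql
-- Hypothetical names: dbo.MyTable and MyColumn stand in for the answer's
-- Table and Column placeholders; @work and Val exist only in this sketch.
DECLARE @work TABLE (
    ID_Column int PRIMARY KEY,
    Val varchar(8000) NULL      -- assumed width; match the real column
);

-- Copy only the rows that contain at least one illegal character.
INSERT INTO @work (ID_Column, Val)
SELECT ID_Column, MyColumn
FROM dbo.MyTable
WHERE PATINDEX('%[^0-9.]%', MyColumn) > 0;

-- Strip one illegal character per row per pass until nothing is left to strip.
WHILE 1 = 1
BEGIN
    UPDATE @work
    SET Val = STUFF(Val, PATINDEX('%[^0-9.]%', Val), 1, '')
    WHERE PATINDEX('%[^0-9.]%', Val) > 0;

    IF @@ROWCOUNT = 0 BREAK;
END;

-- Apply the cleaned values to the real table in a single statement.
UPDATE t
SET MyColumn = w.Val
FROM dbo.MyTable AS t
JOIN @work AS w ON w.ID_Column = t.ID_Column;
```

The real table is written only once, at the end, and rows that were already clean are never copied or touched.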
