处理文本限定符之间出现分隔符的字符串的 SQL 拆分函数? [英] SQL Split function that handles string with delimeter appearing between text qualifiers?

查看:21
本文介绍了处理文本限定符之间出现分隔符的字符串的 SQL 拆分函数?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有几个 SQL 拆分函数,从循环驱动到使用 xml 命令,甚至使用数字表.我还没有找到支持文本限定符的.

There are several SQL split functions, from loop driven, to using xml commands, and even using a numbers table. I haven't found one that supports text qualifiers.

使用下面的示例字符串,我想在 "," 上拆分,但当它出现在双引号或单引号之间时则不会.

Using the example string below, I would like to split on ",", but not when it appears between double or single quotes.

示例数据:

jsmith@anywhere.com, "Sally \"Heat\" Jones" <sally@anywhere.com>, "Mark Jones" <mjones@anywhere.com>, "Stone, Ron" <rstone@anywhere.com>

应该返回一个表:

jsmith@anywhere.com
"Sally \"Heat\" Jones" <sally@anywhere.com>
"Mark Jones" <mjones@anywhere.com>
"Stone, Ron" <rstone@anywhere.com>

我知道这是一个复杂的查询/功能,但任何建议或任何指导都将不胜感激.

I know this is a complex query/function, but any suggestions or any guidance would be mucho appreciated.

推荐答案

CREATE FUNCTION [dbo].[udfSplit]
(
    @nvcString nvarchar(max),
    @nvcDelimiter nvarchar(1),
    @nvcTQ nvarchar(1)
)
RETURNS @tblTokens TABLE (
                            Token nvarchar(max)
                            )
AS
BEGIN

    DECLARE @intCounter int
    DECLARE @nvcToken nvarchar(4000)
    DECLARE @nvcCurrentChar nvarchar(1)
    DECLARE @intStart int

    IF @nvcString <> ''
        BEGIN
            SET @intCounter = 1
            SET @nvcToken = ''
            SET @intStart = 0

            --Loop through each character of the string
            WHILE @intCounter <= LEN(@nvcString)
                BEGIN
                    SET @nvcCurrentChar = SUBSTRING(@nvcString, @intCounter, 1)

                    --If current char is TQ
                    IF @nvcCurrentChar = @nvcTQ
                        BEGIN
                            --Concatonate to token
                            SET @nvcToken = @nvcToken + @nvcCurrentChar

                            --If this is the end TQ
                            IF @intStart <> 0
                                BEGIN
                                    --Fix TQ
                                    SET @nvcToken = dbo.udfRemoveTQFromToken(@nvcToken, @nvcTQ)

                                    IF @nvcToken <> ''
                                        BEGIN
                                            INSERT INTO @tblTokens (Token) VALUES (@nvcToken)
                                            SET @nvcToken = '' 
                                        END
                                    --Reset TQ
                                    SET @intStart = 0 
                                END
                            ELSE
                                BEGIN
                                    SET @nvcToken = dbo.udfRemoveTQFromToken(@nvcToken, @nvcTQ)

                                    IF @nvcToken <> ''
                                        BEGIN
                                            INSERT INTO @tblTokens (Token) VALUES (@nvcToken)
                                            SET @nvcToken = '' 
                                        END

                                    --Mark TQ start position
                                    SET @intStart = @intCounter  
                                END
                        END
                    ELSE IF @intStart = 0 AND @nvcCurrentChar = @nvcDelimiter
                        BEGIN
                            --If not inside TQ, and char is Delimiter
                            SET @nvcToken = dbo.udfRemoveTQFromToken(@nvcToken, @nvcTQ)

                            IF @nvcToken <> ''
                                BEGIN
                                    INSERT INTO @tblTokens (Token) VALUES (@nvcToken)
                                    SET @nvcToken = '' 
                                END
                        END
                    ELSE
                        BEGIN
                            --Current char is not TQ or Delim, add to current token
                            SET @nvcToken = @nvcToken + @nvcCurrentChar
                        END

                    SET @intCounter = @intCounter + 1
                END
        END

    SET @nvcToken = dbo.udfRemoveTQFromToken(@nvcToken, @nvcTQ)

    IF @nvcToken <> ''
        BEGIN
            --Current Token has not been added to table
            INSERT INTO @tblTokens (Token) VALUES (@nvcToken)
        END

    RETURN
END

GO


CREATE FUNCTION [dbo].[udfRemoveTQFromToken]
(
    @nvcToken nvarchar(4000),
    @nvcTQ nvarchar(1)
)
RETURNS nvarchar(4000) AS
BEGIN

    DECLARE @nvcReturn nvarchar(4000)

    --Trim token, needs to be done first, 
    --as we dont want to trim any spaces within the TQ
    --unless it was malformed
    SET @nvcReturn = LTRIM(RTRIM(@nvcToken))

    --If Left char is TQ
    IF LEFT(@nvcReturn, 1) = @nvcTQ
        BEGIN
            --Though both cases perform the removal of the left most char (opening TQ)
            --We need to perform a trim after removal ONLY if it was malformed
            IF RIGHT(@nvcReturn, 1) <> @nvcTQ   
                BEGIN
                    --But no matching end TQ, malformed
                    --fix by removing left most char (the opening TQ)
                    SET @nvcReturn = RIGHT(@nvcReturn, LEN(@nvcReturn) - 1)
                    --Reapply the LTRIM, incase there were spaces after the opening TQ
                    SET @nvcReturn = LTRIM(@nvcReturn)
                END
            ELSE
                BEGIN
                    --has matching end TQ, well-formed
                    --fix by removing left most char (the opening TQ)
                    SET @nvcReturn = RIGHT(@nvcReturn, LEN(@nvcReturn) - 1)
                END
        END

    --Remove the right most char (the closing TQ)
    IF RIGHT(@nvcReturn, 1) = @nvcTQ
        SET @nvcReturn = LEFT(@nvcReturn, LEN(@nvcReturn) - 1)

    RETURN @nvcReturn
END

这篇关于处理文本限定符之间出现分隔符的字符串的 SQL 拆分函数?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆