Split string in column

Question

I have data that came over from a hierarchical database, and it often has columns containing data that SHOULD be in another table, and would have been if the original database had been relational.

The column's data is formatted as LABEL\VALUE pairs, with a space as the delimiter between pairs, like this:

LABEL1\VALUE LABEL2\VALUE LABEL3\VALUE

There is seldom more than one pair in a record, but there can be as many as three. There are 24 different possible labels. There are other columns in this table, including the ID. I have been able to convert this column into a sparse array without using a cursor, with columns for ID, LABEL1, LABEL2, etc.

But this is not ideal for use in another query. My other option is to use a cursor, loop through the entire table once, and write to a temp table, but I can't seem to get it to work the way I want. I was able to do it in just a few minutes in VB.NET using a couple of nested loops, but I can't manage it in T-SQL, even with cursors. The problem with the VB.NET route is that I would have to remember to run that program every time before I use the table it creates. Not ideal.

So, I read a row, split the pairs in 'LABEL1\VALUE LABEL2\VALUE LABEL3\VALUE' into an array, split each pair again into label and value, and then write the rows:

ID, LABEL1, VALUE

ID, LABEL2, VALUE

ID, LABEL3, VALUE

etc.

I realize that 'splitting' the strings is the hard part for SQL here, but it seems a lot more difficult than it needs to be. What am I missing?

Recommended answer

Assuming that neither the labels nor the values contain . characters, you can use a simple function for this:

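CREATE FUNCTION [dbo].[SplitGriswold]
(
  @List   NVARCHAR(MAX),  -- the full string of pairs
  @Delim1 NCHAR(1),       -- separates one pair from the next
  @Delim2 NCHAR(1)        -- separates label from value inside a pair
)
RETURNS TABLE
AS
  RETURN
  ( 
    -- Step 3: PARSENAME splits 'label.value' on the dot
    SELECT 
      Val1 = PARSENAME(Value,2),
      Val2 = PARSENAME(Value,1)
    FROM 
    (
      -- Step 2: turn label<@Delim2>value into label.value
      SELECT REPLACE(Value, @Delim2, '.') FROM
      ( 
        -- Step 1: cut one pair out of @List at each position where a pair starts
        SELECT LTRIM(RTRIM(SUBSTRING(@List, [Number],
          CHARINDEX(@Delim1, @List + @Delim1, [Number]) - [Number])))
        -- Ad hoc numbers table: 1, 2, 3, ... (one row per row in sys.all_objects)
        FROM (SELECT Number = ROW_NUMBER() OVER (ORDER BY name)
          FROM sys.all_objects) AS x
          WHERE Number <= LEN(@List)
          AND SUBSTRING(@Delim1 + @List, [Number], LEN(@Delim1)) = @Delim1
       ) AS y(Value)
     ) AS z(Value)
   );
GO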
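
The function leans on PARSENAME, which is designed for dotted object names and returns the requested part counted from the right. As a quick check of that behavior, here is a minimal sketch using literal strings that are illustrative only and not taken from the original data:

```sql
SELECT
  PARSENAME(N'LABEL1.VALUE', 2)   AS Val1,          -- LABEL1
  PARSENAME(N'LABEL1.VALUE', 1)   AS Val2,          -- VALUE
  PARSENAME(N'LABEL1.VALUE.A', 1) AS DottedValue,   -- A (the extra dot shifts the parts)
  PARSENAME(N'a.b.c.d.e', 1)      AS TooManyParts;  -- NULL (more than four parts)
```

This is why the approach only works as-is when the data has no dots of its own. PARSENAME also treats each part as a sysname, so a label or value longer than 128 characters would be expected to come back as NULL as well.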

Sample usage:

DECLARE @x TABLE(ID INT, string VARCHAR(255));

INSERT @x VALUES
  (1, 'LABEL1\VALUE LABEL2\VALUE LABEL3\VALUE'),
  (2, 'LABEL1\VALUE2 LABEL2\VALUE2');

SELECT x.ID, t.val1, t.val2
FROM @x AS x CROSS APPLY 
 dbo.SplitGriswold(REPLACE(x.string, ' ', N'ŏ'), N'ŏ', '\') AS t;

(I used a Unicode character that is unlikely to appear in the data above, only because a space can be problematic for things like length checks. If this character is likely to appear in your data, choose a different one.)
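
To make the length-check remark concrete: LEN ignores trailing spaces, so a single space has a length of zero, and the function's LEN(@Delim1) test would stop being meaningful if a space were passed as the delimiter. A minimal sketch:

```sql
SELECT
  LEN(N' ')        AS LenOfSpace,     -- 0: LEN ignores trailing spaces
  DATALENGTH(N' ') AS BytesOfSpace,   -- 2: one NCHAR character is two bytes
  LEN(N'ŏ')        AS LenOfMarker;    -- 1: the stand-in delimiter measures as expected
```

Swapping the space for a visible marker character before calling the function sidesteps the issue without changing the function itself.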

Results:

ID   val1       val2
--   --------   --------
1    LABEL1     VALUE
1    LABEL2     VALUE
1    LABEL3     VALUE
2    LABEL1     VALUE2
2    LABEL2     VALUE2
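
Since the question mentions not wanting to re-run a separate program before each use, one option is to wrap the same CROSS APPLY in a view, so the split rows are computed whenever the view is queried. The sketch below assumes dbo.SplitGriswold has been created as shown above; the table dbo.LegacyRecords, its PairData column, and the view name are hypothetical names chosen for illustration.

```sql
-- Hypothetical source table standing in for the real one
CREATE TABLE dbo.LegacyRecords (ID INT PRIMARY KEY, PairData VARCHAR(255));
INSERT dbo.LegacyRecords VALUES
  (1, 'LABEL1\VALUE LABEL2\VALUE LABEL3\VALUE'),
  (2, 'LABEL1\VALUE2 LABEL2\VALUE2');
GO

-- Hypothetical view: one row per ID and label/value pair
CREATE VIEW dbo.LegacyRecordPairs
AS
  SELECT r.ID, Label = t.Val1, Value = t.Val2
  FROM dbo.LegacyRecords AS r
  CROSS APPLY dbo.SplitGriswold(REPLACE(r.PairData, ' ', N'ŏ'), N'ŏ', N'\') AS t;
GO

-- Other queries can then filter or join on the pairs directly
SELECT ID, Label, Value
FROM dbo.LegacyRecordPairs
WHERE Label = N'LABEL2';
```

The view stores no data, so it always reflects the current contents of the source table. If the split turns out to be too slow to repeat on every query, materializing the result into a real table on a schedule would be the alternative.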

If your data might contain ., you can make the query a little more complex, without changing the function, by adding yet another character to the mix that is unlikely or impossible to appear in the data:

DECLARE @x TABLE(ID INT, string VARCHAR(255));

INSERT @x VALUES
(1, 'LABEL1\VALUE.A LABEL2\VALUE.B LABEL3\VALUE.C'),
(2, 'LABEL1\VALUE2.A LABEL2.1\VALUE2.B');

SELECT x.ID, val1 = REPLACE(t.val1, N'ű', '.'), val2 = REPLACE(t.val2, N'ű', '.')
FROM @x AS x CROSS APPLY 
  dbo.SplitGriswold(REPLACE(REPLACE(x.string, ' ', N'ŏ'), '.', N'ű'), N'ŏ', '\') AS t;

Results:

ID   val1       val2
--   --------   --------
1    LABEL1     VALUE.A
1    LABEL2     VALUE.B
1    LABEL3     VALUE.C
2    LABEL1     VALUE2.A
2    LABEL2.1   VALUE2.B
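
On SQL Server 2016 or later (database compatibility level 130 and up), the built-in STRING_SPLIT could take the place of the custom function. This is a different technique from the answer above, not part of it, sketched on the assumption that each pair contains a single backslash and that the order of the pairs does not matter, since STRING_SPLIT does not guarantee output order. Because it does not use PARSENAME, dots in the data need no special handling.

```sql
DECLARE @x TABLE(ID INT, string VARCHAR(255));

INSERT @x VALUES
  (1, 'LABEL1\VALUE.A LABEL2\VALUE.B LABEL3\VALUE.C'),
  (2, 'LABEL1\VALUE2.A LABEL2.1\VALUE2.B');

SELECT x.ID,
       -- everything before the backslash; NULLIF avoids a negative length
       -- if a fragment has no backslash at all
       Label = LEFT(s.value, NULLIF(CHARINDEX('\', s.value), 0) - 1),
       -- everything after the backslash
       Value = SUBSTRING(s.value, CHARINDEX('\', s.value) + 1, LEN(s.value))
FROM @x AS x
CROSS APPLY STRING_SPLIT(x.string, ' ') AS s   -- one row per space-separated pair
WHERE CHARINDEX('\', s.value) > 0;             -- skip fragments that are not pairs
```

The WHERE clause also drops the empty fragments that consecutive spaces would produce.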
