Splitting delimited values in a SQL column into multiple rows

Question

I would really like some advice here. To give some background, I am inserting Message Tracking logs from Exchange 2007 into SQL. As we have millions upon millions of rows per day, I am using a Bulk Insert statement to insert the data into a SQL table.

In fact, I Bulk Insert into a temp table and then MERGE the data from there into the live table. This is to work around parsing issues, as certain fields otherwise have quotes and similar characters around the values.

This works well, except that the recipient-address column is a delimited field separated by a ; character, and it can sometimes be incredibly long, as there can be many email recipients.

I would like to take this column and split the values into multiple rows, which would then be inserted into another table. The problem is that everything I have tried either takes too long or does not work the way I want.

Take this sample data:

message-id                                              recipient-address
2D5E558D4B5A3D4F962DA5051EE364BE06CF37A3A5@Server.com   user1@domain1.com
E52F650C53A275488552FFD49F98E9A6BEA1262E@Server.com     user2@domain2.com
4fd70c47.4d600e0a.0a7b.ffff87e1@Server.com              user3@domain3.com;user4@domain4.com;user5@domain5.com

I would like this to be formatted as follows in my Recipients table:

message-id                                              recipient-address
2D5E558D4B5A3D4F962DA5051EE364BE06CF37A3A5@Server.com   user1@domain1.com
E52F650C53A275488552FFD49F98E9A6BEA1262E@Server.com     user2@domain2.com
4fd70c47.4d600e0a.0a7b.ffff87e1@Server.com              user3@domain3.com
4fd70c47.4d600e0a.0a7b.ffff87e1@Server.com              user4@domain4.com
4fd70c47.4d600e0a.0a7b.ffff87e1@Server.com              user5@domain5.com

Does anyone have any ideas about how I can go about doing this?

I know PowerShell pretty well, so I tried that, but a foreach loop over even 28K records took forever to process. I need something that will run as quickly and efficiently as possible.

Thanks!

Recommended answer

First, create a split function:

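CREATE FUNCTION dbo.SplitStrings
(
    @List       NVARCHAR(MAX),
    @Delimiter  NVARCHAR(255)
)
RETURNS TABLE
AS
    -- Inline table-valued function: returns one row per item, numbered in order.
    -- The position test below compares a single character, so this works as
    -- written for a single-character delimiter such as ';'.
    RETURN (SELECT Number = ROW_NUMBER() OVER (ORDER BY Number),
        Item FROM (SELECT Number, Item = LTRIM(RTRIM(SUBSTRING(@List, Number,
        CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)))
    -- Numbers source: a cross join of sys.all_objects yields 1, 2, 3, ...
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY s1.[object_id])
        FROM sys.all_objects AS s1 CROSS APPLY sys.all_objects) AS n(Number)
    -- Keep only the positions where an item starts (just after a delimiter).
    WHERE Number <= CONVERT(INT, LEN(@List))
        AND SUBSTRING(@Delimiter + @List, Number, 1) = @Delimiter
    ) AS y);
GO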
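
Before applying the function to a whole table, it may help to check it against a single literal. The call below is a minimal sketch that reuses the third recipient list from the question's sample data and assumes dbo.SplitStrings has been created as above:

```sql
-- Sanity check: split one literal recipient list on ';'.
-- The input string is taken from the question's sample data.
SELECT f.Number, f.Item
  FROM dbo.SplitStrings(
         N'user3@domain3.com;user4@domain4.com;user5@domain5.com',
         N';') AS f
  ORDER BY f.Number;
```

This should return three rows, one per address, with Number running from 1 to 3.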

Now you can expand each row into one row per recipient simply with:

SELECT s.[message-id], f.Item
  FROM dbo.SourceData AS s
  CROSS APPLY dbo.SplitStrings(s.[recipient-address], ';') as f;
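
Since the question asks for the split rows to land in a separate table, the same query can feed an INSERT directly. This is only a sketch: dbo.Recipients, its column types, and its lengths are assumptions for illustration, not something given in the question.

```sql
-- Hypothetical destination table; the name, types, and lengths are assumptions.
CREATE TABLE dbo.Recipients
(
    [message-id]        NVARCHAR(255) NOT NULL,
    [recipient-address] NVARCHAR(320) NOT NULL
);
GO

-- One set-based statement: every source row is expanded to one row per recipient.
INSERT INTO dbo.Recipients ([message-id], [recipient-address])
SELECT s.[message-id], f.Item
  FROM dbo.SourceData AS s
  CROSS APPLY dbo.SplitStrings(s.[recipient-address], N';') AS f
  WHERE f.Item <> N'';  -- skip empty items, e.g. from a doubled ';'
```

Doing the split inside a single INSERT ... SELECT keeps the work set-based, which is the main difference from looping over records one at a time.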

Also, I suggest not putting dashes in column names. It means you always have to put them in [square brackets].
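
To illustrate the point about dashes: without brackets, message-id is parsed as the expression message - id rather than as a column name. The sketch below shows one way to rename the columns once with sp_rename. The underscore names are hypothetical, and any queries, views, or procedures that use the old names would need updating afterwards.

```sql
-- With a dash, the column name must always be bracketed.
SELECT s.[message-id] FROM dbo.SourceData AS s;

-- Rename the dashed columns once (new names are hypothetical).
EXEC sp_rename 'dbo.SourceData.[message-id]', 'message_id', 'COLUMN';
EXEC sp_rename 'dbo.SourceData.[recipient-address]', 'recipient_address', 'COLUMN';

-- After the rename, no brackets are needed.
SELECT s.message_id, s.recipient_address FROM dbo.SourceData AS s;
```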
