How to split CSVs from one column to rows in a new table in MSSQL 2008 R2

Problem description

Imagine the following (very bad) table design in MSSQL2008R2:

Table "Posts":
| Id (PK, int) | DatasourceId (PK, int) | QuotedPostIds (nvarchar(255)) | [...] 
|     1        |        1               |                               | [...] 
|     2        |        1               |           1                   | [...] 
|     2        |        2               |           1                   | [...] 
[...]
|   102322     |        2               |     123;45345;4356;76757      | [...] 

So, the column QuotedPostIds contains a semicolon-separated list of self-referencing PostIds (Kids, don't do that at home!). Since this design is ugly as hell, I'd like to extract the values from the QuotedPostIds column into a new n:m relationship table like this:

Desired new table "QuotedPosts":
| QuotingPostId (int) | QuotedPostId (int) | DatasourceId (int) |
|        2            |        1           |         1          |
|        2            |        1           |         2          |
[...]
|      102322         |       123          |         2          |
|      102322         |      45345         |         2          |
|      102322         |       4356         |         2          |
|      102322         |      76757         |         2          |

The primary key for this table could be either a combination of QuotingPostId, QuotedPostId and DatasourceId, or an additional artificial key generated by the database.
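A minimal sketch of the first option (the composite key) might look like the following. The column names and types are taken from the desired layout above; the constraint name `PK_QuotedPosts` and the `NOT NULL` settings are assumptions made for illustration.

```sql
-- Sketch only: target table for the normalized n:m relationship.
-- PK_QuotedPosts is a hypothetical constraint name.
CREATE TABLE QuotedPosts
(
    QuotingPostId int NOT NULL,  -- Posts.Id of the post that contains the quote
    QuotedPostId  int NOT NULL,  -- one Id extracted from Posts.QuotedPostIds
    DatasourceId  int NOT NULL,  -- copied from Posts.DatasourceId
    CONSTRAINT PK_QuotedPosts
        PRIMARY KEY (QuotingPostId, QuotedPostId, DatasourceId)
);
```

With this key, a post that lists the same quoted Id twice would cause a duplicate-key error on insert, so the load query would need to deduplicate.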

It is worth noting that the current Posts table contains about 6,300,000 rows, but only about 285,000 of those have a value set in the QuotedPostIds column. Therefore, it might be a good idea to pre-filter those rows. In any case, I'd like to perform the normalization using internal MSSQL functionality only, if possible.

I have already read other posts on this topic, most of which deal with split functions, but I could not figure out how exactly to create the new table while also copying the appropriate value from the DatasourceId column, nor how to filter the rows so that only the relevant ones are touched.

Thank you!

Edit: I thought it through and finally solved the problem using an external C# program instead of internal MSSQL functionality. Since it seems that it could have been done using Mikael Eriksson's suggestion, I will mark his post as the answer.

Solution

From your comments, you have a string split function but don't know how to use it with a table.

The answer is to use cross apply, something like this:

select P.Id,
       S.Value
from Posts as P
  cross apply dbo.Split(';', P.QuotedPostIds) as S
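To cover the rest of the question (filling the new table, copying DatasourceId, and pre-filtering the rows), the query above could be extended roughly as follows. This is a sketch under several assumptions: a target table `QuotedPosts` with the three int columns from the question already exists, every list item is a valid integer, and `dbo.Split` takes the delimiter and the list and returns a column named `Value`, as the query above implies. SQL Server 2008 R2 has no built-in split function, so a simple hypothetical implementation is included; skip that part if you already have your own.

```sql
-- Hypothetical splitter matching the call dbo.Split(';', list) used above.
-- Skip this batch if a dbo.Split with the same shape already exists.
CREATE FUNCTION dbo.Split (@Delimiter nchar(1), @List nvarchar(4000))
RETURNS @Items TABLE (Value nvarchar(4000) NOT NULL)
AS
BEGIN
    DECLARE @Start int, @End int;

    -- A NULL list yields no rows.
    IF @List IS NULL RETURN;

    -- Append a trailing delimiter so the last item is handled by the loop.
    SET @List  = @List + @Delimiter;
    SET @Start = 1;
    SET @End   = CHARINDEX(@Delimiter, @List, @Start);

    WHILE @End > 0
    BEGIN
        INSERT INTO @Items (Value)
        VALUES (SUBSTRING(@List, @Start, @End - @Start));

        SET @Start = @End + 1;
        SET @End   = CHARINDEX(@Delimiter, @List, @Start);
    END;

    RETURN;
END;
GO

-- Fill the new table, one row per quoted Id.
INSERT INTO QuotedPosts (QuotingPostId, QuotedPostId, DatasourceId)
SELECT DISTINCT                               -- avoids duplicate-key errors
       P.Id,
       CAST(LTRIM(RTRIM(S.Value)) AS int),    -- assumes every item is numeric
       P.DatasourceId
FROM Posts AS P
  CROSS APPLY dbo.Split(';', P.QuotedPostIds) AS S
WHERE P.QuotedPostIds IS NOT NULL             -- pre-filter: only rows with a list
  AND P.QuotedPostIds <> ''
  AND LTRIM(RTRIM(S.Value)) <> '';            -- drop empty items such as a trailing ';'
```

A loop-based splitter like this is easy to read but not the fastest option; for the roughly 285,000 rows mentioned in the question it is probably acceptable for a one-off migration. Because 2008 R2 lacks TRY_CONVERT, any non-numeric item would make the CAST fail, so it may be worth checking the data for such values first.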
