SQL连接多行 [英] SQL Concatenate multiple rows
问题描述
我正在使用Teradata,我有一个像这样的表
I'm using Teradata, I have a table like this
ID String
123 Jim
123 John
123 Jane
321 Jill
321 Janine
321 Johan
我想查询表,以便得到
ID String
123 Jim, John, Jane
321 Jill, Janine, Johan
我尝试了分区,但可以有很多名称。
如何获得此结果。甚至,为我指明正确的方向也很棒。
I tried partition but there can be many names. How do I get this result. Even, to point me in the right direction would be great.
推荐答案
很遗憾,Teradata中没有PIVOT(在14.10中只有TD_UNPIVOT)。
Unfortunately there's no PIVOT in Teradata (only a TD_UNPIVOT in 14.10).
如果运气好的话,您的站点上会有一个总计的UDF来进行群组合并(可能可能性很小)。
If you got luck there's an aggregate UDF at your site to do a group concat (probably low possibility).
否则,有两个选项:递归或聚合。
Otherwise there are two options: recursion or aggregation.
如果已知每个id的最大行数,聚合通常会更快。它有很多代码,但是大多数都是基于剪切和粘贴的。
If the maximum number of rows per id is known aggregation is normally faster. It's a lot of code, but most of it is based on cut&paste.
SELECT
id,
MAX(CASE WHEN rn = 1 THEN string END)
|| MAX(CASE WHEN rn = 2 THEN ',' || string ELSE '' END)
|| MAX(CASE WHEN rn = 3 THEN ',' || string ELSE '' END)
|| MAX(CASE WHEN rn = 4 THEN ',' || string ELSE '' END)
|| ... -- repeat up to the known maximum
FROM
(
SELECT
id, string,
ROW_NUMBER()
OVER (PARTITION BY id
ORDER BY string) AS rn
FROM t
) AS dt
GROUP BY 1;
对于大型表,当您首先在易失表中实现派生表的结果时,效率更高
For large tables it's much more efficient when you materialize the result of the Derived Table in a Volatile Table first using the GROUP BY column as PI.
对于递归,您也应该使用挥发表,因为在递归部分不允许使用OLAP函数。相反,使用视图会重复计算OLAP函数,从而导致性能下降。
For recursion you should use a Volatile Table, too, as OLAP functions are not allowed in the recursive part. Using a view instead will repeatedly calculate the OLAP function and thus result in bad performance.
CREATE VOLATILE TABLE vt AS
(
SELECT
id
,string
,ROW_NUMBER()
OVER (PARTITION BY id
ORDER BY string DESC) AS rn -- reverse order!
,COUNT(*)
OVER (PARTITION BY id) AS cnt
FROM t
) WITH DATA
UNIQUE PRIMARY INDEX(id, rn)
ON COMMIT PRESERVE ROWS;
WITH RECURSIVE cte
(id, list, rn) AS
(
SELECT
id
,CAST(string AS VARCHAR(1000)) -- define maximum size based on maximum number of rows
,rn
FROM vt
WHERE rn = cnt
UNION ALL
SELECT
vt.id
,cte.list || ',' || vt.string
,vt.rn
FROM vt
JOIN cte
ON vt.id = cte.id
AND vt.rn = cte.rn - 1
)
SELECT id, list
FROM cte
WHERE rn = 1;
这种方法有一个问题,它可能需要很多线轴,当您看到时,很容易看到省略 WHERE rn = 1
。
There's one problem with this approach, it might need a lot of spool which is easy to see when you omit theWHERE rn = 1
.
这篇关于SQL连接多行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!