使用n:m-relation复制数据集 [英] Copy datasets with n:m-relation

查看：74 发布时间：2020/5/5 13:12:02 mysql sql postgresql insert many-to-many

本文介绍了使用n:m-relation复制数据集的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我想使用单个SQL语句

I would like to use the single SQL-statement

insert into T (...) select ... from T where ...

复制大量数据集.我的问题是从表T到其他表有N:M关系，并且这些关系也必须被复制.如果我不知道哪个原始数据集属于哪个复制的数据集，该怎么办?让我以身作则.

to copy a lot of datasets. My problem is that there are N:M-relationships from table T to other tables and these have to be copied too. How can I do this, if I do not know which original dataset belongs to which copied dataset? Let me demonstrate by example.

表T:

ID  | COL1 | COL2    
-----------------
1   | A    | B
2   | C    | D

N:M表从表T引用表U(未显示表U):

N:M-table references table U from table T (table U is not shown):

T   | U              
---------
1   | 100
1   | 101
2   | 100
2   | 102

我的复制操作，其中[???]是我不知道的部分:

My copy operation where [???] is the part I do not know:

insert into T (COL1, COL2) select COL1, COL2 from T
insert into NM (T, U) select [???]

之后的数据库内容:

表T:

ID  | COL1 | COL2
-----------------
1   | A    | B
2   | C    | D
3   | A    | B
4   | C    | D

N:M表:

T   | U
---------
1   | 100
1   | 101
2   | 100
2   | 102
3   | 100
3   | 101
4   | 100
4   | 102

注意:

我有数千个数据集(不只是两个)
我想使用'insert ... select'以获得更好的性能

推荐答案

如果幸运的是，您可以运行当前的 PostgreSQL 9.1 ，则可以通过单个命令找到一种简洁，快速的解决方案使用新的 修改数据的CTE .

If you are lucky enough to run the current PostgreSQL 9.1, there is an elegant and fast solution with a single command using the new data-modifying CTEs.

不支持

No such luck with MySQL which does not support Common Table Expressions (CTE), not to mention data-modifying CTE.

假设(col1, col2)最初是唯一的:

在这种情况下，您可以轻松地从表中选择任意切片.
不会浪费t.id的序列号.

You can easily pick arbitrary slices from the table in this case.
No sequence numbers for t.id will be wasted.

WITH s AS (
    SELECT id, col1, col2
    FROM   t
--  WHERE  some condition
    )
    ,i AS (
    INSERT INTO t (col1, col2)
    SELECT col1, col2   -- I gather from comments that id is a serial column
    FROM   s
    RETURNING id, col1, col2
    )
INSERT INTO tu (t, u)
SELECT i.id, tu.u
FROM   tu
JOIN   s ON tu.t = s.id
JOIN   i USING (col1, col2);

如果(col1, col2) 不是唯一，我会看到另外两种方式:

If (col1, col2) is not unique, I see two other ways:

使用窗口功能row_number() 非唯一行唯一.
INSERT行在t.id空间中没有孔.

Use the window function row_number() to make non-unique rows unique.
INSERT rows without holes in the t.id space just like in the query above.

WITH s AS (
    SELECT id, col1, col2
         , row_number() OVER (PARTITION BY col1, col2) AS rn
    FROM   t
--  WHERE some condition
    )
    ,i AS (
    INSERT INTO t (col1, col2)
    SELECT col1, col2
    FROM   s
    RETURNING id, col1, col2
    )
    ,r AS (
    SELECT *
         , row_number() OVER (PARTITION BY col1, col2) AS rn
    FROM   i
    )
INSERT INTO tu (t, u)
SELECT r.id, tu.u
FROM   r
JOIN   s USING (col1, col2, rn)    -- match exactly one id per row
JOIN   tu ON tu.t = s.id;

查询3

这基于@ypercube已经提供的相同思想，但是全部在一个查询中.
如果当前t.id的数字空间中有孔，则会相应地为新行刻录序列号.
别忘了重设序列，使其超过新的最大值，否则，t中的新插入物会重复出现键错误，这些错误会从序列中提取id的默认值.我将此作为命令的最后一步进行了集成.最快的这样最安全.

Query 3

This is based on the same idea that @ypercube already supplied, but all in one query.
If there are holes in numbers space for current t.id, sequence numbers will be burnt for the new rows accordingly.
Don't forget to reset your sequence beyond the new maximum or you will get duplicate key errors for new inserts in t that draw the default for id from the sequence. I integrated this as final step into the command. Fastest & safest this way.

WITH s AS (
    SELECT max(id) AS max_id
    FROM   t
    )
    ,i AS (
    INSERT INTO t (id, col1, col2)
    SELECT id + s.max_id, col1, col2
    FROM   t, s
    )
    ,j AS (
    INSERT INTO tu (t, u)
    SELECT tu.t + s.max_id, tu.u
    FROM   tu, s
    )
SELECT setval('t_id_seq', s.max_id + s.max_id)
FROM   s;

手册中有关 setval()的详细信息.

Details about setval() in the manual.

进行快速测试.

CREATE TEMP TABLE t (id serial primary key, col1 text, col2 text);
INSERT INTO t (col1, col2) VALUES 
 ('A', 'B')
,('C', 'D');

CREATE TEMP TABLE tu (t int, u int);
INSERT INTO tu VALUES
 (1, 100)
,(1, 101)
,(2, 100)
,(2, 102);

SELECT * FROM t;
SELECT * FROM tu;

最近有一个类似的问题，我在其中提供了一个类似的答案.以及版本 8.3 的替代版本，其中没有CTE和窗口功能.

There was a somewhat similar question recently, where I provided a somewhat similar answer. Plus alternatives for version 8.3 without CTEs and window functions.

这篇关于使用n:m-relation复制数据集的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

使用n:m-relation复制数据集 [英] Copy datasets with n:m-relation

问题描述

之后的数据库内容:

推荐答案

查询3

Query 3

相关文章

数据库最新文章

热门教程

热门工具

登录关闭

使用n:m-relation复制数据集 [英] Copy datasets with n:m-relation

问题描述

之后的数据库内容:

推荐答案

查询3

Query 3

相关文章

数据库最新文章

热门教程

热门工具

登录 关闭

登录关闭