To ignore duplicate keys during 'copy from' in postgresql
Question
I have to dump a large amount of data from a file into a PostgreSQL table. I know PostgreSQL does not support 'IGNORE', 'REPLACE', etc. as MySQL does. Almost all posts on the web about this suggest the same thing: dump the data into a temp table and then do an 'INSERT ... SELECT ... WHERE NOT EXISTS ...'.
That will not help in one case: when the file data itself contains duplicate primary keys. Does anybody have an idea how to handle this in PostgreSQL?
P.S. I am doing this from a Java program, if that helps.
Answer
Use the same approach you described, but DELETE (or group, or modify ...) the duplicate PKs in the temp table before loading into the main table.
Something like this:
CREATE TEMP TABLE tmp_table
ON COMMIT DROP
AS
SELECT *
FROM main_table
WITH NO DATA;

COPY tmp_table FROM 'full/file/name/here';

INSERT INTO main_table
SELECT DISTINCT ON (PK_field) *
FROM tmp_table
ORDER BY PK_field, some_fields;

Note that the DISTINCT ON expression must match the leftmost ORDER BY expression; any further sort fields (some_fields here) decide which of the duplicate rows survives.
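If main_table may already contain some of the incoming keys, the DISTINCT ON step alone is not enough: the INSERT would still collide with existing rows. A sketch combining it with the WHERE NOT EXISTS filter mentioned in the question (PK_field, main_table and tmp_table are placeholder names):

```sql
-- Deduplicate within the file AND skip keys already present in main_table.
-- PK_field, main_table and tmp_table are placeholders.
INSERT INTO main_table
SELECT DISTINCT ON (t.PK_field) t.*
FROM tmp_table t
WHERE NOT EXISTS (
    SELECT 1
    FROM main_table m
    WHERE m.PK_field = t.PK_field
)
ORDER BY t.PK_field;
```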
Details: CREATE TABLE AS, COPY, DISTINCT ON
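As an aside not in the original answer: on PostgreSQL 9.5 and later, the "skip rows whose key already exists" part can be delegated to the server with ON CONFLICT DO NOTHING, which needs a unique index (here, the primary key) on the conflict target:

```sql
-- PostgreSQL 9.5+: the server silently skips conflicting rows.
-- DISTINCT ON is kept so you control which in-file duplicate wins;
-- with DO NOTHING alone, whichever row arrives first is the one inserted.
INSERT INTO main_table
SELECT DISTINCT ON (PK_field) *
FROM tmp_table
ORDER BY PK_field
ON CONFLICT (PK_field) DO NOTHING;
```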