Completely copying a postgres table with SQL


Question


DISCLAIMER: This question is similar to the Stack Overflow question here, but none of those answers work for my problem, as I will explain later.

I'm trying to copy a large table (~40M rows, 100+ columns) in postgres where a lot of the columns are indexed. Currently I use this bit of SQL:

CREATE TABLE <tablename>_copy (LIKE <tablename> INCLUDING ALL);
INSERT INTO <tablename>_copy SELECT * FROM <tablename>;

This method has two issues:

  1. It adds the indices before data ingest, so it will take much longer than creating the table without indices and then indexing after copying all of the data.
  2. This doesn't copy SERIAL-style columns properly. Instead of setting up a new 'counter' on the new table, it sets the default value of the column in the new table to the counter of the old table, so the two tables share a single counter instead of each having its own.
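
The second issue can be seen by inspecting the column defaults of the copy. A minimal sketch, assuming a hypothetical table named events with a SERIAL column named id (neither name comes from the real schema):

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE events (id SERIAL PRIMARY KEY, payload text);
CREATE TABLE events_copy (LIKE events INCLUDING ALL);

-- Both rows report the same default, nextval('events_id_seq'::regclass),
-- so the copy draws its ids from the original table's sequence.
SELECT table_name, column_default
FROM information_schema.columns
WHERE table_name IN ('events', 'events_copy')
  AND column_name = 'id';
```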

The table size makes indexing time a real issue. It also makes it infeasible to dump to a file and then re-ingest. I also don't have the advantage of a command line. I need to do this in SQL.

What I'd like to do is either make an exact copy outright with some miracle command or, if that's not possible, copy the table with all constraints but without indices, making sure they're the constraints 'in spirit' (i.e. a new counter for a SERIAL column). Then copy all of the data with a SELECT * and finally copy over all of the indices.

Sources

  1. Stack Overflow question about database copying: This isn't what I'm asking for, for three reasons:

    • It uses the command line option pg_dump -t x2 | sed 's/x2/x3/g' | psql and in this setting I don't have access to the command line
    • It creates the indices pre data ingest, which is slow
    • It doesn't update the serial columns correctly, as evidenced by the default nextval('x1_id_seq'::regclass)
  2. Method to reset the sequence value for a postgres table: This is great, but unfortunately it is very manual.

Solution

Well, you're gonna have to do some of this stuff by hand, unfortunately. But it can all be done from something like psql. The first command is simple enough:

select * into newtable from oldtable;

This will create newtable with oldtable's data but without its indexes, defaults, or constraints. Then you've got to create the indexes and sequences etc. on your own. You can get a list of all the indexes on a table with the command:

select indexdef from pg_indexes where tablename='oldtable';
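
One way to turn those definitions into statements for the new table is a plain text substitution, much like the sed step mentioned in the question. This is only a rough sketch. It assumes both tables live in the public schema, that every index name contains the table name (as PostgreSQL's default names such as oldtable_col_idx do, so the renamed indexes don't collide with the existing ones), and that no column name contains the string oldtable:

```sql
-- Preview the rewritten statements before running anything.
SELECT replace(indexdef, 'oldtable', 'newtable') AS create_stmt
FROM pg_indexes
WHERE schemaname = 'public'      -- assumed schema
  AND tablename = 'oldtable';

-- Once the preview looks right, execute each statement in turn.
DO $$
DECLARE
    stmt text;
BEGIN
    FOR stmt IN
        SELECT replace(indexdef, 'oldtable', 'newtable')
        FROM pg_indexes
        WHERE schemaname = 'public'
          AND tablename = 'oldtable'
    LOOP
        EXECUTE stmt;
    END LOOP;
END
$$;
```

Indexes that back a PRIMARY KEY or UNIQUE constraint come back as plain unique indexes this way; the constraint itself would still have to be attached afterwards, for example with ALTER TABLE ... ADD CONSTRAINT ... PRIMARY KEY USING INDEX.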

Then run psql -E to access your db and use \d to look at the old table (the -E flag makes psql print the catalog queries it runs behind \d). You can then mangle these two queries to get the info on the sequences:

SELECT c.oid,
  n.nspname,
  c.relname
FROM pg_catalog.pg_class c
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname ~ '^(oldtable)$'
  AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 2, 3;

SELECT a.attname,
  pg_catalog.format_type(a.atttypid, a.atttypmod),
  (SELECT substring(pg_catalog.pg_get_expr(d.adbin, d.adrelid) for 128)
   FROM pg_catalog.pg_attrdef d
   WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef),
  a.attnotnull, a.attnum
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = '74359' AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum;

Replace that 74359 above with the oid you get from the previous query.
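
Once the default expression shows which column was fed by a sequence, giving the new table its own counter might look like the sketch below. It assumes a single serial column named id and a new sequence named newtable_id_seq; both names are placeholders, not something taken from the real schema:

```sql
-- Create a sequence owned by the new column, so it is dropped with it.
CREATE SEQUENCE newtable_id_seq OWNED BY newtable.id;

-- Start the counter just past the largest id that was copied over.
SELECT setval('newtable_id_seq',
              COALESCE((SELECT max(id) FROM newtable), 0) + 1,
              false);  -- false: the next nextval() returns this exact value

-- Point the column default at the new sequence instead of the old one.
ALTER TABLE newtable
    ALTER COLUMN id SET DEFAULT nextval('newtable_id_seq');
```

Because select * into does not carry over NOT NULL either, the column may also need ALTER TABLE newtable ALTER COLUMN id SET NOT NULL to match the original.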
