如何尽可能透明地将现有的Postgres表迁移到分区表? [英] How to migrate an existing Postgres Table to partitioned table as transparently as possible?

查看:113
本文介绍了如何尽可能透明地将现有的Postgres表迁移到分区表?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在postgres-DB中有一个现有表。为了演示起见,它是这样的:

I have an existing table in a postgres-DB. For the sake of demonstration, this is how it looks like:

create table myTable(
    forDate date not null,
    key2 int not null,
    value int not null,
    primary key (forDate, key2)
);

insert into myTable (forDate, key2, value) values
    ('2000-01-01', 1, 1),
    ('2000-01-01', 2, 1),
    ('2000-01-15', 1, 3),
    ('2000-03-02', 1, 19),
    ('2000-03-30', 15, 8),
    ('2011-12-15', 1, 11);

但是与这几个值相反, myTable 实际上是巨大的,并且还在不断增长。我正在从该表中生成各种报告,但是目前我98%的报告使用一个月,其余查询的工作时间甚至更短。通常,我的查询导致Postgres对这个巨大的表进行表扫描,我正在寻找减少问题的方法。 表分区似乎完全适合我的问题。我可以将桌子分成几个月。但是,如何将现有表转换为分区表?手册明确指出:

However in contrast to these few values, myTable is actually HUGE and it is growing continuously. I am generating various reports from this table, but currently 98% of my reports work with a single month and the remaining queries work with an even shorter timeframe. Oftentimes my queries cause Postgres to do table scans over this huge table and I am looking for ways to reduce the problem. Table partitioning seems to fit my problem perfectly. I could just partition my table into months. But how do I turn my existing table into a partitioned table? The manual explicitly states:


不可能将常规表转换为分区表,反之亦然

It is not possible to turn a regular table into a partitioned table or vice versa

所以我需要开发自己的迁移脚本,该脚本将分析当前表并将其迁移。需求如下:

So I need to develop my own migration script, which will analyze the current table and migrate it. The needs are as follows:


  • 在设计时 myTable 涵盖的时间范围未知。

  • 每个分区从该月的第一天到该月的最后一天应覆盖一个月。

  • 表将增长无限期地,因此对于要生成多少张表,我没有理智的停止值

  • 结果应尽可能透明,这意味着我想尽可能少地接触现有数据码。最好的情况下,感觉就像是一张普通表,我可以在其中插入或选择它们而没有任何特殊之处。

  • 可接受的数据库停机迁移时间

  • 高度推荐与纯Postgres一起使用,而无需在服务器上安装任何插件或其他东西。

  • 数据库是PostgreSQL 10,无论如何迟早都将升级到新版本,因此如果有帮助,这是一个选择

  • At design time the time frame which myTable covers is unknown.
  • Each partition should cover one month from the first day of that month to the last day of that month.
  • The table will grow indefinitely, so I have no sane "stop value" for how many tables to generate
  • The result should be as transparent as possible, meaning that I want to touch as little as possible of my existing code. In best case this feels like a normal table which I can insert to and select from without any specials.
  • A database downtime for migration is acceptable
  • Getting along with pure Postgres without any plugins or other things that need to be installed on the server is highly preferred.
  • Database is PostgreSQL 10, upgrading to a newer version will happen sooner or later anyway, so this is an option if it helps

如何迁移要分区的表?

推荐答案

在Postgres 10中引入了声明性分区,它可以减轻您的工作量,例如生成触发器或规则与巨大的if / else语句重定向到正确的表。 Postgres现在可以自动执行此操作。让我们从迁移开始:

In Postgres 10 "Declarative Partitioning" was introduced, which can relieve you of a good deal of work such as generating triggers or rules with huge if/else statements redirecting to the correct table. Postgres can do this automatically now. Let's start with the migration:


  1. 重命名旧表并创建新的分区表

  1. Rename the old table and create a new partitioned table

alter table myTable rename to myTable_old;

create table myTable_master(
    forDate date not null,
    key2 int not null,
    value int not null
) partition by range (forDate);


这几乎不需要任何解释。重命名旧表(在数据迁移后,我们将其删除),并为分区获得一个主表,该表与原始表基本相同,但没有索引)

This should hardly require any explanation. The old table is renamed (after data migration we'll delete it) and we get a master table for our partition which is basically the same as our original table, but without indexes)


  1. 创建一个可以根据需要生成新分区的函数:

  1. Create a function that can generate new partitions as we need them:

create function createPartitionIfNotExists(forDate date) returns void
as $body$
declare monthStart date := date_trunc('month', forDate);
    declare monthEndExclusive date := monthStart + interval '1 month';
    -- We infer the name of the table from the date that it should contain
    -- E.g. a date in June 2005 should be int the table mytable_200506:
    declare tableName text := 'mytable_' || to_char(forDate, 'YYYYmm');
begin
    -- Check if the table we need for the supplied date exists.
    -- If it does not exist...:
    if to_regclass(tableName) is null then
        -- Generate a new table that acts as a partition for mytable:
        execute format('create table %I partition of myTable_master for values from (%L) to (%L)', tableName, monthStart, monthEndExclusive);
        -- Unfortunatelly Postgres forces us to define index for each table individually:
        execute format('create unique index on %I (forDate, key2)', tableName);
    end if;
end;
$body$ language plpgsql;


稍后会派上用场。


  1. 创建一个基本上只委托给我们主表的视图:

  1. Create a view that basically just delegates to our master table:

create or replace view myTable as select * from myTable_master;


  • 创建规则,以便在插入规则时不仅仅更新分区表,还可以根据需要创建一个新分区:

  • Create rule so that when we insert into the rule, we'll not just update out partitioned table, but also create a new partition if needed:

    create or replace rule autoCall_createPartitionIfNotExists as on insert
        to myTable
        do instead (
            select createPartitionIfNotExists(NEW.forDate);
            insert into myTable_master (forDate, key2, value) values (NEW.forDate, NEW.key2, NEW.value)
        );
    


  • 当然,如果您还需要 update delete ,您还需要一个简单明了的规则。

    Of course, if you also need update and delete, you also need a rule for those which should be straight forward.


    1. 实际上是迁移旧表:

    1. Actually migrate the old table:

    -- Finally copy the data to our new partitioned table
    insert into myTable (forDate, key2, value) select * from myTable_old;
    
    -- And get rid of the old table
    drop table myTable_old;
    


    现在,表迁移完成了无需知道需要多少个分区,并且视图 myTable 绝对透明。您可以像以前一样简单地从表中插入并选择,但是从分区中可以获得性能上的好处。

    Now migration of the table is complete without that there was any need to know how many partitions are needed and also the view myTable will be absolutely transparent. You can simple insert and select from that table as before, but you might get the performance benefit from partitioning.

    请注意,仅需要视图,因为分区表不能有行触发器。如果您可以随时从代码中手动调用 createPartitionIfNotExists ,那么您就不需要该视图及其所有规则。在这种情况下,您需要在迁移期间手动添加分区als:

    Note that the view is only needed, because a partitioned table cannot have row triggers. If you can get along with calling createPartitionIfNotExists manually whenever needed from your code, you do not need the view and all it's rules. In this case you need to add the partitions als manually during migration:

    do
    $$
    declare rec record;
    begin
        -- Loop through all months that exist so far...
        for rec in select distinct date_trunc('month', forDate)::date yearmonth from myTable_old loop
            -- ... and create a partition for them
            perform createPartitionIfNotExists(rec.yearmonth);
        end loop;
    end
    $$;
    

    这篇关于如何尽可能透明地将现有的Postgres表迁移到分区表?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆