PostgreSQL - View or Partitioning?


Question

I am trying to redesign a PostgreSQL database to get better performance. The database backs an ERP information system and holds a large amount of data (four years' worth). Each year used to live in a separate database, which was a bad solution (building reports was a pain in the a??), so I consolidated all four databases into one. The problem is that some tables are now just too large. To regain some performance I decided to split the data across tables, and I see two ways to do this.

First: split each large table into an "arch_table" and a "working_table", and use views for reporting.

Second: use partitioning (say, a separate partition for every year).

So my question is: which way is better, partitioning or some kind of archiving system?
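
For concreteness, the first option could look roughly like the sketch below. The column list and the view name report_table are assumptions made for illustration; only arch_table and working_table come from the question.

```sql
-- Hypothetical schema: the question does not show the real ERP columns.
CREATE TABLE working_table (
    id        int PRIMARY KEY,
    doc_date  date NOT NULL,
    amount    numeric
);

-- Archive table with the same columns, constraints and indexes.
CREATE TABLE arch_table (LIKE working_table INCLUDING ALL);

-- Reporting view (hypothetical name) that presents both tables as one.
CREATE VIEW report_table AS
SELECT * FROM working_table
UNION ALL
SELECT * FROM arch_table;
```

Reports would then read from report_table, while day-to-day transactions touch only working_table.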

Recommended answer

PostgreSQL's partitioning is, effectively, a bunch of child tables that use check constraints to verify that only the correct data is in each partition. A parent (master) table is created, and the partitions are created as additional tables that inherit from it:

CREATE TABLE measurement (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
);

CREATE TABLE measurement_y2006m02 ( ) INHERITS (measurement);
CREATE TABLE measurement_y2006m03 ( ) INHERITS (measurement);
...
CREATE TABLE measurement_y2007m11 ( ) INHERITS (measurement);
CREATE TABLE measurement_y2007m12 ( ) INHERITS (measurement);
CREATE TABLE measurement_y2008m01 ( ) INHERITS (measurement);

Obviously, I've omitted a bit of code (the check constraints in particular), but you can check out the documentation on PostgreSQL table partitioning. The most important part of partitioning is to make sure you build automated scripts that create new partitions ahead of time and merge old ones.
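
As a rough illustration of such a script, the sketch below creates the child table for a given month together with its check constraint. The function name create_measurement_partition and the index naming are assumptions, not part of the original answer, and it assumes a PostgreSQL version that supports format() and CREATE INDEX IF NOT EXISTS (9.5 or later).

```sql
-- Hypothetical helper: creates the partition for the month containing p_month
-- if it does not exist yet, following the measurement_yYYYYmMM naming above.
CREATE OR REPLACE FUNCTION create_measurement_partition(p_month date)
RETURNS text
LANGUAGE plpgsql
AS $$
DECLARE
    v_start date := date_trunc('month', p_month)::date;
    v_end   date := (date_trunc('month', p_month) + interval '1 month')::date;
    v_name  text := 'measurement_y' || to_char(v_start, 'YYYY')
                    || 'm' || to_char(v_start, 'MM');
BEGIN
    -- The CHECK constraint states which dates this child table may hold.
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I (
             CHECK (logdate >= %L AND logdate < %L)
         ) INHERITS (measurement)',
        v_name, v_start, v_end);

    -- Indexes are not inherited, so each child gets its own.
    EXECUTE format(
        'CREATE INDEX IF NOT EXISTS %I ON %I (logdate)',
        v_name || '_logdate_idx', v_name);

    RETURN v_name;
END;
$$;

-- Example: make sure next month's partition exists (e.g. from a scheduled job).
SELECT create_measurement_partition((CURRENT_DATE + interval '1 month')::date);
```

Note that with inheritance-based partitioning, rows inserted into measurement stay in the parent unless a trigger or rule redirects them, or the application inserts into the child table directly.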

Operationally, when PostgreSQL goes to run a query such as SELECT * FROM measurement WHERE logdate BETWEEN '2006-02-13' AND '2006-02-22'; and each partition has a check constraint on logdate, the optimizer goes "AH HA! I know what's up here, there's a partition. I'll just look at table measurement_y2006m02 and pull back the appropriate data."
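
One way to check that the planner really skips the other partitions is to look at the plan. This is a minimal sketch that assumes the child tables carry check constraints on logdate; the exact plan text depends on the PostgreSQL version, so no output is shown.

```sql
-- 'partition' has been the default setting since PostgreSQL 8.4;
-- it is set explicitly here only to make the dependency visible.
SET constraint_exclusion = partition;

-- The plan should list only the parent table and the partitions whose
-- check constraints can overlap the requested date range.
EXPLAIN
SELECT *
FROM measurement
WHERE logdate BETWEEN DATE '2006-02-13' AND DATE '2006-02-22';
```

If every child table still appears in the plan, the usual cause is a missing check constraint or a WHERE clause the planner cannot compare against the constraints at plan time.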

As you age data out of the main partitions, you can either just drop the old tables or merge them into an archive partition. Much of this work can be automated through scripting: all you really need to do is write the scripts once and test them. A side benefit is that older data tends not to change, so many partitions will require no index maintenance or vacuuming.
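
Both aging strategies might be scripted along these lines. The archive table name measurement_archive and its cutoff date are assumptions chosen for illustration.

```sql
-- Option 1: discard an old month entirely.
DROP TABLE measurement_y2006m02;

-- Option 2: fold an old month into a single archive partition.
BEGIN;

-- Hypothetical archive partition holding everything before 2007.
CREATE TABLE IF NOT EXISTS measurement_archive (
    CHECK (logdate < DATE '2007-01-01')
) INHERITS (measurement);

-- Copy the rows, then remove the monthly table they came from.
INSERT INTO measurement_archive
SELECT * FROM measurement_y2006m03;

DROP TABLE measurement_y2006m03;

COMMIT;
```

Doing the copy and the drop in one transaction keeps queries against measurement from seeing the rows twice or not at all. If the old table should be kept around but hidden from queries on the parent, ALTER TABLE measurement_y2006m03 NO INHERIT measurement; detaches it without deleting anything.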

Keep in mind that partitioning is largely a data management solution and may not provide the performance benefit you're looking for. Tuning queries, adding indexes, and examining the PostgreSQL configuration (postgresql.conf, storage configuration, and OS configuration) may lead to far bigger performance gains than partitioning your data.
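
A first tuning pass along those lines might look like the sketch below. The index name and the report query are assumptions; the idea is simply to measure a slow query before and after adding an index, and to look at a few memory settings that are often left at their defaults.

```sql
-- Hypothetical index for reports that filter on a date range.
-- On an inheritance parent this covers only the parent's own rows;
-- each child table needs its own index.
CREATE INDEX measurement_logdate_idx ON measurement (logdate);

-- Show the actual plan, timings and buffer usage of a typical report query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT city_id, max(peaktemp) AS max_peaktemp
FROM measurement
WHERE logdate >= DATE '2006-02-01'
  AND logdate <  DATE '2006-03-01'
GROUP BY city_id;

-- Inspect a few settings from postgresql.conf that commonly affect performance.
SHOW shared_buffers;
SHOW work_mem;
SHOW effective_cache_size;
```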
