Add datetime constraint to a PostgreSQL multi-column partial index

Question

I've got a PostgreSQL table called queries_query, which has many columns.

Two of these columns, created and user_sid, are frequently used together in SQL queries by my application to determine how many queries a given user has done over the past 30 days. It is very, very rare that I query these stats for any time older than the most recent 30 days.

Here's my question:

I've currently created my multi-column index on these two columns by running:

CREATE INDEX CONCURRENTLY some_index_name ON queries_query (user_sid, created)

But I'd like to further restrict the index to only care about those queries in which the created date is within the past 30 days. I've tried doing the following:

CREATE INDEX CONCURRENTLY some_index_name ON queries_query (user_sid, created)
WHERE created >= NOW() - '30 days'::INTERVAL

But this throws an exception stating that my function must be immutable.

I'd love to get this working so that I can optimize my index, and cut back on the resources Postgres needs to do these repeated queries.

Recommended answer

You get an exception in your attempt to use now() because the function is not IMMUTABLE (obviously) and, I quote the manual here:


All functions and operators used in an index definition must be "immutable" ...
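
To see how PostgreSQL classifies a given function, you can look up its volatility in the system catalog. This is only a sketch; the functions listed are picked for illustration and are not part of the original question:

```sql
-- provolatile: 'i' = immutable, 's' = stable, 'v' = volatile
SELECT DISTINCT proname, provolatile
FROM   pg_proc
WHERE  proname IN ('now', 'lower', 'random')
ORDER  BY proname;
```

now() is listed as stable rather than immutable, which is why it is rejected in an index predicate.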

I see two ways to utilize a (much more efficient) partial index here:

1. Partial index with condition on a constant timestamp

CREATE INDEX queries_recent_idx ON queries_query (user_sid, created)
WHERE created > '2013-01-07 00:00'::timestamp;

Assuming created is actually defined as timestamp. It wouldn't work to provide a timestamp constant for a timestamptz column (timestamp with time zone). The cast from timestamp to timestamptz (or vice versa) depends on the current time zone setting and is not immutable. Use a constant of matching data type. Understand the basics of timestamps with / without time zone:

  • Ignoring timezones altogether in Rails and PostgreSQL
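
If you are not sure how created is declared, a quick check along these lines may help before picking the constant. The second statement is a hedged variant for a timestamptz column; the index name queries_recent_tz_idx and the +00 offset are assumptions made for illustration:

```sql
-- Check the declared type of the column before choosing the constant
SELECT column_name, data_type
FROM   information_schema.columns
WHERE  table_name  = 'queries_query'
AND    column_name = 'created';

-- If data_type is 'timestamp with time zone', use a timestamptz constant.
-- The explicit offset makes the literal independent of the session time zone.
CREATE INDEX queries_recent_tz_idx ON queries_query (user_sid, created)
WHERE  created > '2013-01-07 00:00+00'::timestamptz;
```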

Drop and recreate that index at hours with low traffic, maybe with a cron job on a daily or weekly basis (or whatever is good enough for you). Creating an index is pretty fast, especially a partial index that is comparatively small. This solution also doesn't need to add anything to the table.

Assuming no concurrent access to the table, automatic index recreation could be done with a function like this:

CREATE OR REPLACE FUNCTION f_index_recreate()
  RETURNS void AS
$func$
BEGIN
   DROP INDEX IF EXISTS queries_recent_idx;
   EXECUTE format('
      CREATE INDEX queries_recent_idx
      ON queries_query (user_sid, created)
      WHERE created > %L::timestamp'
    , LOCALTIMESTAMP - interval '30 days');  -- timestamp constant
--  , now() - interval '30 days');           -- alternative for timestamptz
END
$func$  LANGUAGE plpgsql;

Call:

SELECT f_index_recreate();
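
After the call, one way to confirm which cutoff ended up in the index predicate is to read the stored definition back. A small sketch using the pg_indexes system view:

```sql
-- Shows the full CREATE INDEX statement, including the WHERE clause
SELECT indexname, indexdef
FROM   pg_indexes
WHERE  tablename = 'queries_query'
AND    indexname = 'queries_recent_idx';
```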

now() (like you had) is the equivalent of CURRENT_TIMESTAMP and returns timestamptz. Cast to timestamp with now()::timestamp or use LOCALTIMESTAMP instead.

  • Select today's (since midnight) timestamps only
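
The type difference described above can be checked directly with pg_typeof(); the column aliases are just labels chosen for this sketch:

```sql
SELECT pg_typeof(now())            AS now_type    -- timestamp with time zone
     , pg_typeof(now()::timestamp) AS cast_type   -- timestamp without time zone
     , pg_typeof(LOCALTIMESTAMP)   AS local_type; -- timestamp without time zone
```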

Tested with Postgres 9.2 - 9.4.
SQL Fiddle.

If you have to deal with concurrent access, use CREATE INDEX CONCURRENTLY. But you can't wrap this command into a function because, per documentation:


... a regular CREATE INDEX command can be performed within a transaction block, but CREATE INDEX CONCURRENTLY cannot.

So, with two separate transactions:

CREATE INDEX CONCURRENTLY queries_recent_idx2 ON queries_query (user_sid, created)
WHERE  created > '2013-01-07 00:00'::timestamp;  -- your new condition
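
A failed CREATE INDEX CONCURRENTLY can leave an invalid index behind, so it may be worth confirming that the new index is valid before dropping the old one. A possible check, assuming the index names used above:

```sql
-- indisvalid = false means the concurrent build did not complete
SELECT indexrelid::regclass AS index_name, indisvalid
FROM   pg_index
WHERE  indrelid = 'queries_query'::regclass;
```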

Then:

DROP INDEX CONCURRENTLY IF EXISTS queries_recent_idx;

Optionally, rename to old name:

ALTER INDEX queries_recent_idx2 RENAME TO queries_recent_idx;



2. Partial index with condition on archived tag



Add an archived tag to your table:

ALTER TABLE queries_query ADD COLUMN archived boolean NOT NULL DEFAULT FALSE;

UPDATE the column at intervals of your choosing to "retire" older rows and create an index like:

CREATE INDEX some_index_name ON queries_query (user_sid, created)
WHERE NOT archived;
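
The periodic UPDATE that retires older rows could look roughly like this. It is a sketch that assumes created is a timestamp column and that 30 days is the desired cutoff:

```sql
-- Flag rows older than 30 days; "NOT archived" skips rows already flagged,
-- so repeated runs only touch newly outdated rows
UPDATE queries_query
SET    archived = TRUE
WHERE  NOT archived
AND    created < LOCALTIMESTAMP - interval '30 days';
```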

Add a matching condition to your queries (even if it seems redundant) to allow them to use the index. Check with EXPLAIN ANALYZE whether the query planner catches on: it should be able to use the index for queries on a newer date. But it won't understand more complex conditions that don't match exactly.
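
A query carrying the matching condition might look like the following. The user_sid value is hypothetical, and its type is assumed to be text since the question does not say:

```sql
EXPLAIN ANALYZE
SELECT count(*)
FROM   queries_query
WHERE  user_sid = 'some-user-sid'   -- hypothetical value, text type assumed
AND    created >= LOCALTIMESTAMP - interval '30 days'
AND    NOT archived;                -- repeats the index predicate verbatim
```

Repeating NOT archived exactly as written in the index definition gives the planner the simplest possible predicate to match.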

You don't have to drop and recreate the index, but the UPDATE on the table may be more expensive than index recreation and the table gets slightly bigger.

I would go with the first option (index recreation). In fact, I am using this solution in several databases. The second incurs more costly updates.

Both solutions retain their usefulness over time, but performance slowly deteriorates as more outdated rows are included in the index.
