Indexed ORDER BY with LIMIT 1


Question

I'm trying to fetch the most recent row in a table. I have a simple timestamp column created_at, which is indexed. When I query with ORDER BY created_at DESC LIMIT 1, it takes far longer than I think it should (about 50 ms on my machine with 36k rows).

EXPLAIN reports a backward index scan, but I confirmed that changing the index to (created_at DESC) does not change the cost the query planner estimates for a plain index scan.

How can I optimize this use case?

Running PostgreSQL 9.2.4.

# EXPLAIN SELECT * FROM articles ORDER BY created_at DESC LIMIT 1;
                                                  QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
Limit  (cost=0.00..0.58 rows=1 width=1752)
   ->  Index Scan Backward using index_articles_on_created_at on articles  (cost=0.00..20667.37 rows=35696 width=1752)
(2 rows)


Recommended answer

Assuming we are dealing with a big table, a partial index might help:

CREATE INDEX tbl_created_recently_idx ON tbl (created_at DESC)
WHERE created_at > '2013-09-15 0:0'::timestamp;

As you already found out: descending or ascending hardly matters here. Postgres can scan backwards at almost the same speed (exceptions apply with multi-column indices).

A query that uses this index:

SELECT * FROM tbl
WHERE  created_at > '2013-09-15 0:0'::timestamp -- matches index
ORDER  BY created_at DESC
LIMIT  1;

The point here is to make the index much smaller, so it should be easier to cache and maintain.


  1. You need to pick a timestamp that is guaranteed to be smaller than the most recent one.
  2. You should recreate the index from time to time to cut off old data.
  3. The condition needs to be IMMUTABLE.

So the one-time effect deteriorates over time. The specific problem is the hard-coded condition:

WHERE created_at > '2013-09-15 0:0'::timestamp



Automate



You could update the index and your queries manually from time to time. Or you can automate it with the help of a function like this one:

-- Single place that holds the cutoff timestamp used by index and queries
CREATE OR REPLACE FUNCTION f_min_ts()
  RETURNS timestamp LANGUAGE sql IMMUTABLE AS
$$SELECT '2013-09-15 0:0'::timestamp$$;

Index:

CREATE INDEX tbl_created_recently_idx ON tbl (created_at DESC)
WHERE created_at > f_min_ts();

Query:

SELECT * FROM tbl
WHERE  created_at > f_min_ts()
ORDER  BY created_at DESC
LIMIT  1;

Automate recreation with a cron job or some trigger-based event. Your queries can stay the same now. But after changing the function, you need to recreate every index that uses it in any way. Just drop and create each one.
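
As a rough sketch, a scheduled job might run something like the following. The new cutoff '2013-10-15 0:0' is an arbitrary placeholder, and wrapping the steps in a single transaction is an assumption, not part of the answer above. Note that the table stays locked until the transaction commits, so this suits small tables or quiet periods.

```sql
BEGIN;

-- Move the cutoff forward (placeholder value, pick one older than your newest row)
CREATE OR REPLACE FUNCTION f_min_ts()
  RETURNS timestamp LANGUAGE sql IMMUTABLE AS
$$SELECT '2013-10-15 0:0'::timestamp$$;

-- The existing index was built with the old cutoff, so rebuild it
DROP INDEX IF EXISTS tbl_created_recently_idx;

CREATE INDEX tbl_created_recently_idx ON tbl (created_at DESC)
WHERE created_at > f_min_ts();

COMMIT;
```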

... test whether you are actually hitting the bottleneck with this.
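
One way to test this is to look at measured execution time rather than the planner's estimated cost. A sketch against the articles table from the question (EXPLAIN ANALYZE actually executes the query):

```sql
-- Reports actual timing and buffer usage instead of estimates only
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM articles ORDER BY created_at DESC LIMIT 1;
```

If the reported runtime turns out to be far below the 50 ms observed by the client, the time is probably being spent outside the index scan, for example in transferring the wide row (width=1752 in the plan above), and a smaller index would not help much.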

Check whether a simple DROP INDEX ... ; CREATE INDEX ... does the job. If it does, your index was probably bloated, and your autovacuum settings may be off.
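
A possible way to check for bloat is to compare the index size before and after a rebuild. This sketch uses the index name from the question's EXPLAIN output and REINDEX in place of a manual drop and create, which is a substitution on my part. REINDEX blocks writes to the table while it runs.

```sql
-- Size before the rebuild
SELECT pg_size_pretty(pg_relation_size('index_articles_on_created_at'));

-- Rebuild the index from scratch
REINDEX INDEX index_articles_on_created_at;

-- Size after the rebuild; a large drop suggests the index was bloated
SELECT pg_size_pretty(pg_relation_size('index_articles_on_created_at'));
```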

Or try VACUUM FULL ANALYZE to get your whole table plus indices in pristine condition and check again.

Other options include the usual general performance tuning and covering indexes, depending on what you actually retrieve from the table.
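
To illustrate the covering-index idea: if only a few columns are needed, a multi-column index containing them can allow an index-only scan (available since PostgreSQL 9.2). The column id and the index name below are hypothetical, since the question does not show the table definition.

```sql
-- Hypothetical: index holds every column the query reads
CREATE INDEX articles_created_at_id_idx ON articles (created_at DESC, id);

-- Index-only scans rely on an up-to-date visibility map
VACUUM ANALYZE articles;

-- Select only indexed columns instead of SELECT *
SELECT created_at, id
FROM   articles
ORDER  BY created_at DESC
LIMIT  1;
```

Whether the planner actually picks an index-only scan depends on the table's state, so it is worth confirming with EXPLAIN.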
