使用大表上的OFFSET优化查询 [英] Optimize query with OFFSET on large table

查看：107 发布时间：2020/5/23 20:25:22 sql postgresql pagination sql-order-by postgresql-9.5

本文介绍了使用大表上的OFFSET优化查询的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有桌子

create table big_table (
id serial primary key,
-- other columns here
vote int
);

此表非常大，大约有7,000万行，我需要查询:

This table is very big, approximately 70 million rows, I need to query:

SELECT * FROM big_table
ORDER BY vote [ASC|DESC], id [ASC|DESC]
OFFSET x LIMIT n  -- I need this for pagination

您可能知道，当x很大时，这样的查询非常慢.

As you may know, when x is a large number, queries like this are very slow.

为了优化性能，我添加了索引:

For performance optimization I added indexes:

create index vote_order_asc on big_table (vote asc, id asc);

和

create index vote_order_desc on big_table (vote desc, id desc);

EXPLAIN表明上面的SELECT查询使用这些索引，但是无论如何，偏移很大都会很慢.

EXPLAIN shows that the above SELECT query uses these indexes, but it's very slow anyway with a large offset.

我该怎么做才能在大表中使用OFFSET优化查询?也许PostgreSQL 9.5或更高版本具有某些功能?我已经搜索了，但是没有找到任何东西.

What can I do to optimize queries with OFFSET in big tables? Maybe PostgreSQL 9.5 or even newer versions have some features? I've searched but didn't find anything.

推荐答案

大型OFFSET总是很慢. Postgres必须对所有行进行排序，并计算 visible 直到您的偏移量为止.要直接跳过所有先前的行，可以向表中添加索引的row_number(或创建

A large OFFSET is always going to be slow. Postgres has to order all rows and count the visible ones up to your offset. To skip all previous rows directly you could add an indexed row_number to the table (or create a MATERIALIZED VIEW including said row_number) and work with WHERE row_number > x instead of OFFSET x.

但是，这种方法仅对只读(或大部分)数据有意义.为可以同时更改的表数据实现相同的操作更具挑战性.您需要先<精确地定义所需的行为.

However, this approach is only sensible for read-only (or mostly) data. Implementing the same for table data that can change concurrently is more challenging. You need to start by defining desired behavior exactly.

我建议使用不同的方法分页:

I suggest a different approach for pagination:

SELECT *
FROM   big_table
WHERE  (vote, id) > (vote_x, id_x)  -- ROW values
ORDER  BY vote, id  -- needs to be deterministic
LIMIT  n;

vote_x和id_x来自 上一页 的最后行(对于DESC和).或从 first (如果向后导航 ).

Where vote_x and id_x are from the last row of the previous page (for both DESC and ASC). Or from the first if navigating backwards.

已有的索引支持比较行值-一种符合ISO SQL标准的功能，但并非每个RDBMS都支持它.

Comparing row values is supported by the index you already have - a feature that complies with the ISO SQL standard, but not every RDBMS supports it.

CREATE INDEX vote_order_asc ON big_table (vote, id);

或按降序排列:

SELECT *
FROM   big_table
WHERE  (vote, id) < (vote_x, id_x)  -- ROW values
ORDER  BY vote DESC, id DESC
LIMIT  n;

可以使用相同的索引.
我建议您声明列NOT NULL或熟悉NULLS FIRST|LAST构造:

Can use the same index.
I suggest you declare your columns NOT NULL or acquaint yourself with the NULLS FIRST|LAST construct:

PostgreSQL按日期时间asc排序，是否为null?

请注意两件事:

WHERE子句中的ROW值不能用分隔的成员字段替换. WHERE (vote, id) > (vote_x, id_x) 不能替换为:

The ROW values in the WHERE clause cannot be replaced with separated member fields. WHERE (vote, id) > (vote_x, id_x) cannot be replaced with:

WHERE  vote >= vote_x
AND    id   > id_x

这将排除所有行和id <= id_x行，而我们只希望对同一票进行表决，而不是对下一票进行表决.正确的翻译应该是:

That would rule out all rows with id <= id_x, while we only want to do that for the same vote and not for the next. The correct translation would be:

WHERE (vote = vote_x AND id > id_x) OR vote > vote_x

...不能很好地与索引配合使用，并且随着列的增加而变得越来越复杂.

... which doesn't play along with indexes as nicely, and gets increasingly complicated for more columns.

很简单.那是我一开始提到的特殊情况.

Would be simple for a single column, obviously. That's the special case I mentioned at the outset.

该技术不适用于ORDER BY中的混合方向，例如:

The technique does not work for mixed directions in ORDER BY like:

ORDER  BY vote ASC, id DESC

至少我想不出一种通用的方法来有效地实现这一目标.如果两列中至少有一列是数字类型，则可以在(vote, (id * -1))上使用带倒数值的函数索引，并在ORDER BY中使用相同的表达式:

At least I can't think of a generic way to implement this as efficiently. If at least one of both columns is a numeric type, you could use a functional index with an inverted value on (vote, (id * -1)) - and use the same expression in ORDER BY:

ORDER  BY vote ASC, (id * -1) ASC

使用大表上的OFFSET优化查询 [英] Optimize query with OFFSET on large table

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

使用大表上的OFFSET优化查询 [英] Optimize query with OFFSET on large table

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭