Improving OFFSET performance in PostgreSQL

Views: 2730 | Published: 2017/3/16 21:19:46 | Tags: database, postgresql, query-optimization

Problem description

I have a table I'm doing an ORDER BY on before a LIMIT and OFFSET in order to paginate.

Adding an index on the ORDER BY column makes a massive difference to performance (when used in combination with a small LIMIT). On a 500,000 row table, I saw a 10,000x improvement adding the index, as long as there was a small LIMIT.

However, the index has no impact for high OFFSETs (i.e. later pages in my pagination). This is understandable: a b-tree index makes it easy to iterate in order from the beginning but not to find the nth item.
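
For concreteness, the pattern described above might look like the following sketch. The table `items` and the column `created_at` are hypothetical names chosen for illustration; they do not come from the original question.

```sql
-- Hypothetical table and index, for illustration only.
create table items (id serial primary key, created_at timestamp not null);
create index items_by_created_at on items (created_at);

-- Early page: an index scan can stop after reading the first 10 entries.
select * from items order by created_at limit 10 offset 0;

-- Late page: the scan still has to walk past the 400,000 skipped entries
-- before it can return the 10 requested rows.
select * from items order by created_at limit 10 offset 400000;
```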

It seems that what would help is a counted b-tree index, but I'm not aware of support for these in PostgreSQL. Is there another solution? It seems that optimizing for large OFFSETs (especially in pagination use-cases) isn't that unusual.

Unfortunately, the PostgreSQL manual simply says "The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large OFFSET might be inefficient."

Solution

You might want a computed index.

Let's create a table:

    create table sales(day date, amount real);

And fill it with some random data:

    insert into sales
        select current_date + s.a as day, random()*100 as amount
        from generate_series(1,20) as s(a);

Index it by day, nothing special here:

    create index sales_by_day on sales(day);

Create a row position function. There are other approaches; this one is the simplest:

    create or replace function sales_pos (date) returns bigint
       as 'select count(day) from sales where day <= $1;'
       language sql immutable;

Check that it works (but don't call it like this on large datasets, because every row triggers its own count over the table):

    select sales_pos(day), day, amount from sales;

     sales_pos |    day     |  amount  
    -----------+------------+----------
             1 | 2011-07-08 |  41.6135
             2 | 2011-07-09 |  19.0663
             3 | 2011-07-10 |  12.3715
    ..................

Now the tricky part: add another index computed on the sales_pos function values:

    create index sales_by_pos on sales using btree(sales_pos(day));

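One caveat worth keeping in mind: `sales_pos` is declared `immutable` so that PostgreSQL accepts it in an index expression, but its result actually depends on the table contents. Each stored position is computed when its row is indexed and is not revisited when other rows are inserted or deleted. A minimal sketch of refreshing the positions, assuming the table changes rarely enough that a periodic rebuild is acceptable:

```sql
-- A new row with an earlier day shifts every later position by one,
-- but the entries already stored in sales_by_pos stay as they were.
insert into sales values (current_date, 50.0);

-- Rebuilding the expression index recomputes sales_pos(day) for every row.
reindex index sales_by_pos;
```

On newer PostgreSQL releases, which run index maintenance with a restricted `search_path`, the function body may need to reference `public.sales` explicitly. That is an assumption about your setup rather than something the original answer covers.
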
Here is how you use it. 5 is your "offset", 10 is the "limit". Positions are 1-based, so this returns the rows at positions 5 through 14:

    select * from sales where sales_pos(day) >= 5 and sales_pos(day) < 5+10;

        day     | amount  
    ------------+---------
     2011-07-12 | 94.3042
     2011-07-13 | 12.9532
     2011-07-14 | 74.7261
    ...............

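To avoid hard-coding the numbers, the same predicate can be wrapped in a prepared statement. This is only a sketch: the name `sales_page` and its parameter order are made up for illustration, and an explicit `order by` is added because a `where` clause alone does not guarantee row order.

```sql
-- Hypothetical helper: $1 = first position to return (1-based), $2 = page size.
prepare sales_page(bigint, bigint) as
    select day, amount
    from sales
    where sales_pos(day) >= $1
      and sales_pos(day) <  $1 + $2
    order by sales_pos(day);

-- Returns the rows at positions 5 through 14.
execute sales_page(5, 10);
```

The sketch assumes `day` values are unique. Rows that share a day get the same position from `sales_pos`, so a page could then contain more or fewer rows than the requested size.
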
It is fast, because when you call it like this, Postgres uses precalculated values from the index:

    explain select * from sales
      where sales_pos(day) >= 5 and sales_pos(day) < 5+10;

                                    QUERY PLAN
    --------------------------------------------------------------------------
     Index Scan using sales_by_pos on sales  (cost=0.50..8.77 rows=1 width=8)
       Index Cond: ((sales_pos(day) >= 5) AND (sales_pos(day) < 15))

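To see the difference on your own data, one option is to compare this plan with the plain `OFFSET` form of the same page. The sketch below assumes the `sales` table and both indexes from this answer; because `sales_pos` is 1-based, position 5 corresponds to `offset 4`. No output is shown, since plans and timings depend on the data and the PostgreSQL version.

```sql
-- Plain pagination: the executor reads and discards the skipped rows.
explain analyze
select * from sales order by day limit 10 offset 4;

-- Position-based pagination via the expression index.
explain analyze
select * from sales
where sales_pos(day) >= 5 and sales_pos(day) < 5 + 10
order by sales_pos(day);
```
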
Hope it helps.
