Sample Query to show Cardinality estimation error in PostgreSQL
Question
I am working on a project using PostgreSQL 9.3. I was using the query below to show how selectivity estimation errors can lead to a multi-fold increase in query execution time on the TPC-H workload using PostgreSQL 8.3.
select
    n_name,
    sum(l_extendedprice * (1 - l_discount)) as revenue
from
    customer,
    orders,
    lineitem,
    supplier,
    nation,
    region
where
    c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and l_suppkey = s_suppkey
    and c_nationkey = s_nationkey
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and (r_name = 'ASIA' or r_name = 'AFRICA')
    and o_orderdate >= date '1994-01-01'
    and o_orderdate < date '1994-01-01' + interval '1 year'
    and l_shipdate <= l_receiptdate
    and l_commitdate <= l_shipdate + integer '90'
    and l_extendedprice <= 20000
    and c_name like '%r#00%'
    and c_acctbal <= 2400
group by
    n_name
order by
    revenue desc
The problem was that PostgreSQL 8.3 chose a plan involving a lot of nested-loop joins, since the selectivity estimates on lineitem and customer were off by a big margin. I think this was mainly due to the LIKE pattern matching. The optimal plan should have used hash joins instead.
Recently I upgraded to PostgreSQL 9.3 for my project and observed that the above query no longer produces a bad plan. I have spent some time trying to find a query with a large cardinality estimation error on TPC-H 1 GB data, with no success so far. Do any PostgreSQL geeks know an off-the-shelf query on the TPC-H benchmark, or any query, that shows a cardinality estimation error in PostgreSQL 9.3?
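(For context, one way to quantify such errors is to compare the planner's row estimate against the actual count with EXPLAIN (ANALYZE). The sketch below reuses predicates from the query above and assumes a loaded TPC-H database:)

```sql
-- A large gap between "rows=" (estimate) and "actual ... rows="
-- on a plan node indicates a cardinality estimation error there.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM   customer
WHERE  c_name LIKE '%r#00%'   -- LIKE patterns are hard to estimate
AND    c_acctbal <= 2400;
```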
Answer
This is to answer the comment by @Twelfth as well as the question itself.
Explicit inner join syntax (INNER JOIN, CROSS JOIN, or unadorned JOIN) is semantically the same as listing the input relations in FROM, so it does not constrain the join order.
...
To force the planner to follow the join order laid out by explicit JOINs, set the join_collapse_limit run-time parameter to 1. (Other possible values are discussed below.)
...
Constraining the planner's search in this way is a useful technique both for reducing planning time and for directing the planner to good query plans.
Bold emphasis mine. Conversely, you can abuse the same mechanism to direct the query planner to a bad query plan for your testing purposes. Read the whole manual page; it should be instrumental.
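A sketch of such abuse, assuming the planner's preferred join order differs from the written one (the order below is deliberately arbitrary, not a recommendation):

```sql
SET join_collapse_limit = 1;  -- planner must keep the JOIN order as written
SELECT n_name, sum(l_extendedprice * (1 - l_discount)) AS revenue
FROM   region
JOIN   nation   ON n_regionkey = r_regionkey
JOIN   customer ON c_nationkey = n_nationkey
JOIN   orders   ON o_custkey   = c_custkey
JOIN   lineitem ON l_orderkey  = o_orderkey
JOIN   supplier ON s_suppkey   = l_suppkey AND s_nationkey = c_nationkey
WHERE  r_name = 'ASIA'
GROUP  BY n_name;
```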
Also, you can force nested loops by disabling the alternative join methods one by one (best in your session only). Like:
SET enable_hashjoin = off;
Etc.
About checking and setting parameters:
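For instance (session-local; these are standard PostgreSQL settings):

```sql
SHOW enable_hashjoin;               -- current value of one setting
SELECT name, setting
FROM   pg_settings
WHERE  name LIKE 'enable_%';        -- all planner method toggles at once
SET enable_mergejoin = off;         -- change for this session only
RESET enable_mergejoin;             -- back to the default
```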
- Query a parameter (postgresql.conf setting) like "max_connections"
One obvious way would be to disable autovacuum and add / remove rows from the table. The query planner then works with outdated statistics. Note that some other commands update statistics as well.
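A minimal sketch of that idea; mytable and its id column are hypothetical stand-ins for one of your TPC-H tables:

```sql
-- Keep autovacuum away from this one table, then skew its contents.
ALTER TABLE mytable SET (autovacuum_enabled = false);
DELETE FROM mytable WHERE id % 10 <> 0;  -- drop ~90 % of the rows
-- Until ANALYZE (or a manual VACUUM ANALYZE) runs, the planner still
-- estimates based on the old, ten-times-larger contents.
```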
Statistics are stored in the catalog tables pg_class and pg_statistic.
SELECT * FROM pg_class WHERE oid = 'mytable'::regclass;
SELECT * FROM pg_statistic WHERE starelid = 'mytable'::regclass;
This leads me to another option. You could forge entries in these two tables. Superuser privileges required.
You don't strike me as a newcomer, but a warning for the general public: If you break something in the catalog tables, your database (cluster) might go belly-up. You have been warned.
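A sketch of what such forgery could look like; reltuples and relpages are the pg_class columns the planner reads for table size, and mytable is again a stand-in. Only ever do this on a throwaway test cluster:

```sql
-- DANGER: superuser only; corrupts planner input on purpose.
UPDATE pg_class
SET    reltuples = 100000000,  -- faked row count
       relpages  = 1000000     -- faked page count
WHERE  oid = 'mytable'::regclass;
-- The planner now treats mytable as huge, regardless of its real size.
```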