Solution for speeding up a slow SELECT DISTINCT query in Postgres

Problem description

The query is basically:

SELECT DISTINCT "my_table"."foo" FROM "my_table" WHERE ...

Assume I'm 100% certain that the DISTINCT portion of the query is the reason it runs slowly. I've omitted the rest of the query to avoid confusion, since the slowness of the DISTINCT portion is what I'm primarily concerned with (DISTINCT is always a source of slowness).

The table in question has 2.5 million rows of data. The DISTINCT is needed for purposes not listed here (because I don't want back a modified query, but rather just general information about making distinct queries run faster at the DBMS level, if possible).

How can I make DISTINCT run quicker (using Postgres 9, specifically) without altering the SQL (i.e., I can't alter the SQL coming in, but I do have access to optimize something at the DB level)?

Solution

Your DISTINCT is most likely causing the database to sort the output rows in order to find duplicates. If you put an index on the column(s) selected by the query, the database may be able to read them out in index order and skip the sort step. A lot will depend on the details of the query and the tables involved; your saying that you "know the problem is with the DISTINCT" really limits the scope of available answers.
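As a rough sketch of the indexing suggestion, something like the following could be tried at the database level without touching the incoming SQL. The table and column names come from the question; the index name `my_table_foo_idx` is made up for illustration, and the WHERE clause is left out because the question elides it. Whether the planner actually uses the index depends on that clause and on the data.

```sql
-- Hypothetical index name; my_table and foo are taken from the question.
CREATE INDEX my_table_foo_idx ON my_table (foo);

-- Refresh planner statistics so the new index is costed with current data.
ANALYZE my_table;

-- Inspect the plan the planner chooses for the DISTINCT query.
-- The original WHERE clause is elided in the question, so it is omitted here.
EXPLAIN ANALYZE
SELECT DISTINCT "my_table"."foo" FROM "my_table";
```

Running the `EXPLAIN ANALYZE` once before and once after creating the index shows whether anything changed. A `Sort` followed by `Unique` means the rows are still being sorted to remove duplicates, while an index scan feeding `Unique` means the sort step was avoided. The planner may also pick a `HashAggregate`, in which case the index may make little difference for this query.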
