PostgreSQL:计数查询花费太多时间 [英] PostgreSQL: count query takes too much time

查看:237
本文介绍了PostgreSQL:计数查询花费太多时间的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的查询存在一些问题-它花费了太多时间( 2636124 ms!):

  SELECT COUNT(*)AS __count 
from dictionary_dictionary
WHERE NOT( dictionary_dictionary。 id IN(SELECT U1。 word_id AS Col1
FROM dictionary_frequencydata U1
U1。 user_id = 1)));

此查询由ORM(Django)生成。当我尝试执行它(使用ORM)时,我的应用程序挂起,并且当我放入psql时-psql挂起。



EXPLAIN ANALYZE:

 总计(成本= 329583550.40..329583550.41行= 1宽度= 8)(实际
时间= 2636109.932..2636109.933行= 1循环= 1 )
->对dictionary_dictionary的Seq扫描(费用= 0.00..329583390.76
行= 63856宽度= 0)(实际时间= 2636109.922..2636109.922行= 0循环= 1)
过滤器:(不(子计划1))
被过滤器删除的行:127712
子计划1
->实现(成本= 0.00..4821.74行= 135828宽度= 4)(实际时间= 0.0006..12.453行= 63856循环= 127712)
->对dictionary_frequencydata u1的Seq扫描(成本= 0.00..3611.60行= 135828宽度= 4)(实际时间= 0.299..95.915行= 127712循环= 1)
过滤器:(user_id = 1)
行由过滤器删除:28054
计划时间:0.277 ms
执行时间:2636124.744 ms
(11 wierszy)`

我来自Django的模型

  class Dictionary(DateTimeModel):
base_word = models.ForeignKey(BaseDictionary,related_name = _('dict_words'))
word = models.CharField(max_length = 64)
version = models.ForeignKey(Version)

class FrequencyData(DateTimeModel):
word = models.ForeignKey(Dictionary,related_name = _('frequency_data'))
count = models.BigIntegerField(null = True,blank = True)
源= models.ForeignKey(来源,related_name = _('frequency_data'),null = True,空白= True)
用户= models.ForeignKey(settings.AUTH_USER_MODEL,related_name = _('frequency_data'))
user_ip_address = models.GenericIPAddressField(null = True,blank = True)
date_of_checking = models.DateTimeField(null = True,blank = True)
is_checked = models.BooleanField(default = False)

表定义:

  \d + dictionary_dictionary 
Tabela public.dictionary_dictionary
Kolumna | Typ | Porównanie| Nullowalne | Domyślnie| Przechowywanie | Cel statystyk | Opis
---------------------- + ---------------------- ---- + ------------ + ------------ + ------------------- ------------------------------------------------- + ---------------- + ------------------- + ------
id |整数| |不为空| nextval(’dictionary_dictionary_id_seq’:: regclass)|普通|
date_created |带时区的时间戳记| |不为空| |普通|
date_modified |带时区的时间戳记| |不为空| |普通|
个字|字符变化(64)| |不为空| |扩展| |
algorithm_version_id |整数| |不为空| |普通|
base_word_id |整数| |不为空| |普通|

Indeksy:
dictionary_dictionary_pkey主键,btree(id)
dictionary_phonet_algorithm_version_id_0f0af100 btree(algorithm_version_id)
dictionary_dictionary_base_word_id_b_b_b_b
Ograniczenia kluczy obcych:
dictionary__algorithm_version_id_0f0af100_fk_phonetic_ 外键(algorithm_version_id)参考dictionary_algorithmversion(ID)DEFERRABLE INITIALLY DEFERRED
dictionary__base_word_id_8db15cb4_fk_phonetic_ 外键(base_word_id)参考dictionary_grammaticaldictionary(ID)DEFERRABLE INITIALLY DEFERRED

Wskazywany przez:
表 dictionary_frequencydata CONSTRAINT dictionary__word_id_c231110d_fk_phonetic_外键(word_id)参考dictionary_dictionary(id)DEFERRABLE INITIALLYLY DEFERRED == b
$ b = =
\d + dictionary_frequencydata
Tabela public.dictionary_frequencydata
Kolumna | Typ | Porównanie| Nullowalne | Domyślnie| Przechowywanie | Cel statystyk | Opis
------------------ + -------------------------- + ------------ + ------------ + ----------------------- ---------------------------------------- + --------- ------- + --------------- + ------
id |整数| |不为空| nextval(’dictionary_frequencydata_id_seq’:: regclass)|普通|
date_created |带时区的时间戳记| |不为空| |普通|
date_modified |带时区的时间戳记| |不为空| |普通|
个计数| bigint | | | |普通|
user_ip_address | inet | | | |主| |
date_of_checking |带时区的时间戳记| | | |普通|
is_checked |布尔值| |不为空| |普通|
source_id |整数| | | |普通|
user_id |整数| |不为空| |普通|
word_id |整数| |不为空| |普通|

Indeksy:
dictionary_frequencydata_pkey主键,btree(id)
dictionary_frequencydata_source_id_38bb205a btree(source_id)
dictionary_frequencydata_user_id_c6dfedce btree $ b(user_id) dictionary_frequencydata_word_id_c231110d btree(word_id)

Ograniczenia kluczy obcych:
dictionary__source_id_38bb205a_fk_phonetic_ FOREIGN KEY(source_id)REFERENCES dictionary_frequency_id_ce_fed_id_ce_fed_d )参考auth_user(id)最初可延迟
dictionary__word_id_c231110d_fk_phonetic_外键(word_id)参考dictionary_dictionary(id)最初可延迟
$$
$ b

它是共享主机。
词典数据库表-12万行FrequencyData-16万行

解决方案

尝试添加 DISTINCT 关键字,这应该缩小ID的选中子集:

  SELECT COUNT(*)AS __count 
FROM dictionary_dictionary
WHERE NOT( dictionary_dictionary。 id IN(选择不同的U1。 word_id AS Col1
从 dictionary_frequencydata U1
W1位置U1。 user_id = 1));


I have some problems with my query - it takes too much time (2636124 ms!):

 SELECT COUNT(*) AS "__count" 
 FROM "dictionary_dictionary" 
 WHERE NOT ("dictionary_dictionary"."id" IN (SELECT U1."word_id" AS Col1 
                                             FROM "dictionary_frequencydata" U1 
                                             WHERE U1."user_id" = 1));

This query is generated by ORM (Django). When I try to execute it (with ORM) my app hangs and also when I put in to psql - psql hangs.

EXPLAIN ANALYZE:

Aggregate  (cost=329583550.40..329583550.41 rows=1 width=8) (actual 
time=2636109.932..2636109.933 rows=1 loops=1)
   ->  Seq Scan on dictionary_dictionary  (cost=0.00..329583390.76 
       rows=63856 width=0) (actual time=2636109.922..2636109.922 rows=0 loops=1)
           Filter: (NOT (SubPlan 1))
           Rows Removed by Filter: 127712
           SubPlan 1
             ->  Materialize  (cost=0.00..4821.74 rows=135828 width=4) (actual time=0.006..12.453 rows=63856 loops=127712)
                ->  Seq Scan on dictionary_frequencydata u1  (cost=0.00..3611.60 rows=135828 width=4) (actual time=0.299..95.915 rows=127712 loops=1)
                     Filter: (user_id = 1)
                     Rows Removed by Filter: 28054
 Planning time: 0.277 ms
 Execution time: 2636124.744 ms
 (11 wierszy)`

My models from Django

class Dictionary(DateTimeModel):
    base_word = models.ForeignKey(BaseDictionary, related_name=_('dict_words'))
    word = models.CharField(max_length=64)
    version = models.ForeignKey(Version)

class FrequencyData(DateTimeModel):
    word = models.ForeignKey(Dictionary, related_name=_('frequency_data'))
    count = models.BigIntegerField(null=True, blank=True)
    source = models.ForeignKey(Source, related_name=_('frequency_data'), null=True, blank=True)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, related_name=_('frequency_data'))
    user_ip_address = models.GenericIPAddressField(null=True, blank=True)
    date_of_checking = models.DateTimeField(null=True, blank=True)
    is_checked = models.BooleanField(default=False)

The table definitions:

\d+ dictionary_dictionary
                                                                 Tabela "public.dictionary_dictionary"
       Kolumna        |           Typ            | Porównanie | Nullowalne |                             Domyślnie                              | Przechowywanie | Cel statystyk | Opis 
----------------------+--------------------------+------------+------------+--------------------------------------------------------------------+----------------+---------------+------
 id                   | integer                  |            | not null   | nextval('dictionary_dictionary_id_seq'::regclass) | plain          |               | 
 date_created         | timestamp with time zone |            | not null   |                                                                    | plain          |               | 
 date_modified        | timestamp with time zone |            | not null   |                                                                    | plain          |               | 
 word                 | character varying(64)    |            | not null   |                                                                    | extended       |               | 
 algorithm_version_id | integer                  |            | not null   |                                                                    | plain          |               | 
 base_word_id         | integer                  |            | not null   |                                                                    | plain          |               | 

Indeksy:
    "dictionary_dictionary_pkey" PRIMARY KEY, btree (id)
    "dictionary_phonet_algorithm_version_id_0f0af100" btree (algorithm_version_id)
    "dictionary_dictionary_base_word_id_8db15cb4" btree (base_word_id)

Ograniczenia kluczy obcych:
    "dictionary__algorithm_version_id_0f0af100_fk_phonetic_" FOREIGN KEY (algorithm_version_id) REFERENCES dictionary_algorithmversion(id) DEFERRABLE INITIALLY DEFERRED
    "dictionary__base_word_id_8db15cb4_fk_phonetic_" FOREIGN KEY (base_word_id) REFERENCES dictionary_grammaticaldictionary(id) DEFERRABLE INITIALLY DEFERRED

Wskazywany przez:
    TABLE "dictionary_frequencydata" CONSTRAINT "dictionary__word_id_c231110d_fk_phonetic_" FOREIGN KEY (word_id) REFERENCES dictionary_dictionary(id) DEFERRABLE INITIALLY DEFERRED

=========
\d+ dictionary_frequencydata
                                                               Tabela "public.dictionary_frequencydata"
     Kolumna      |           Typ            | Porównanie | Nullowalne |                           Domyślnie                           | Przechowywanie | Cel statystyk | Opis 
------------------+--------------------------+------------+------------+---------------------------------------------------------------+----------------+---------------+------
 id               | integer                  |            | not null   | nextval('dictionary_frequencydata_id_seq'::regclass) | plain          |               | 
 date_created     | timestamp with time zone |            | not null   |                                                               | plain          |               | 
 date_modified    | timestamp with time zone |            | not null   |                                                               | plain          |               | 
 count            | bigint                   |            |            |                                                               | plain          |               | 
 user_ip_address  | inet                     |            |            |                                                               | main           |               | 
 date_of_checking | timestamp with time zone |            |            |                                                               | plain          |               | 
 is_checked       | boolean                  |            | not null   |                                                               | plain          |               | 
 source_id        | integer                  |            |            |                                                               | plain          |               | 
 user_id          | integer                  |            | not null   |                                                               | plain          |               | 
 word_id          | integer                  |            | not null   |                                                               | plain          |               | 

Indeksy:
    "dictionary_frequencydata_pkey" PRIMARY KEY, btree (id)
    "dictionary_frequencydata_source_id_38bb205a" btree (source_id)
    "dictionary_frequencydata_user_id_c6dfedce" btree (user_id)
    "dictionary_frequencydata_word_id_c231110d" btree (word_id)

Ograniczenia kluczy obcych:
    "dictionary__source_id_38bb205a_fk_phonetic_" FOREIGN KEY (source_id) REFERENCES dictionary_frequencysource(id) DEFERRABLE INITIALLY DEFERRED
    "dictionary__user_id_c6dfedce_fk_auth_user" FOREIGN KEY (user_id) REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED
    "dictionary__word_id_c231110d_fk_phonetic_" FOREIGN KEY (word_id) REFERENCES dictionary_dictionary(id) DEFERRABLE INITIALLY DEFERRED

It's shared hosting. Dictionary db tabel - 120k rows FrequencyData - 160k rows

解决方案

Try adding DISTINCT keyword, which should narrow the checked subset of ids:

SELECT COUNT(*) AS "__count" 
FROM "dictionary_dictionary" 
WHERE NOT ("dictionary_dictionary"."id" IN (SELECT distinct U1."word_id" AS Col1
                                            FROM "dictionary_frequencydata" U1 
                                            WHERE U1."user_id" = 1));

这篇关于PostgreSQL:计数查询花费太多时间的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆