eclipselink jpa使用COUNT(id)生成计数查询,而不是COUNT(*) [英] eclipselink jpa generates count queries using COUNT(id) instead COUNT(*)

查看:316
本文介绍了eclipselink jpa使用COUNT(id)生成计数查询,而不是COUNT(*)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用Eclipselink,Spring Data和Postgresql。在我的项目中,我注意到当使用SpringData存储库提供的分页结果时,有如下查询:

  SELECT COUNT(id)
FROM表格
WHERE [按规格生成的部分]

其中id是表的主键。挖掘解释我注意到COUNT(id)比COUNT()慢10倍左右(非常大的表)(count(id)在id列中查找非空值,count( )简单地返回匹配条件的行数),count(*)可以使用索引而count(id) - 不是。

我跟踪SpringData基本存储库类看起来只有JPA实现负责这个查询生成。


  1. 使用count(id)的原因是COUNT(* )?


  2. $ b
  3. 是否可以更改此行为(无论如何 - 甚至可以增强现有组件)?任何帮助表示赞赏

    - [edit] -

    有一张表:

      \d ord_order 
    表public.ord_order
    列|类型|修改者
    ------------------------- + ------------------- ------- + ------------------------------------------ ----------------
    id |整数| NOT NULL DEFAULT nextval('ord_order_id_seq':: regclass)
    test_order |布尔| DEFAULT false
    ...
    索引:
    pk_orderPRIMARY KEY,btree(id)
    idx_test_orderbtree(test_order)



    #解释SELECT COUNT(*)FROM ord_order WHERE(test_order = false);
    QUERY PLAN
    ---------------------------------------- ----------------------------------
    Aggregate(cost = 89898.79..89898.80 rows = 1 width = 0)
    - >仅索引使用ord_order上的idx_test_order进行扫描(cost = 0.43..85375.37 rows = 1809366 width = 0)
    索引条件:(test_order = false)
    过滤器:(NOT test_order)
    (4 wiersze )



    #解释SELECT COUNT(id)FROM ord_order WHERE(test_order = false);
    QUERY PLAN
    ---------------------------------------- ----------------------------------
    Aggregate(cost = 712924.52..712924.53 rows = 1 width = 4)
    - > Seq Scan on ord_order(cost = 0.00..708401.10 rows = 1809366 width = 4)
    Filter:(NOT test_order)
    (3 wiersze)

    现在差值是~90k vs〜713k和索引扫描与全扫描

    解决方案

    我设法使用该实现提供自定义Spring Data Repository基类实现和工厂。由于结果生成计数查询现在具有以下形式:

      SELECT COUNT(1)FROM table 

    与COUNT(*)具有相同的计划。这似乎是很好的解决方案,并在应用程序中的所有定义的存储库全局工作。



    我不知道如何生成COUNT(*),COUNT(1)因为COUNT函数需要一些表达式作为参数,我可以提供静态值 - 1


    I am using Eclipselink, Spring Data and Postgresql. In my project I noticed that when using paged results provided by SpringData repositories there are queries like:

    SELECT COUNT(id) 
    FROM table 
    WHERE [part generated according to specification]
    

    where "id" is the primary key of "table". Digging with explain I noticed that COUNT(id) is about 10 times slower than COUNT() for a very large table (count(id) looks for non-null values in "id" column while count( ) simply returns number of rows matching criteria), also count(* ) could make use of indexes while count(id) - not.

    I traced the SpringData basic repository class and it seems only JPA implementation is responsible for this query generation.

    1. what is the reason of using count(id) instead faster COUNT(* )?
    2. can I change this behaviour (anyway - even enhancing existing components)?

    any help appreciated

    -- [edit] --

    there is a table:

    \d ord_order
                                           Table "public.ord_order"
             Column          |           Type            |                       Modificators
    -------------------------+--------------------------+----------------------------------------------------------
     id                      | integer                  | NOT NULL DEFAULT nextval('ord_order_id_seq'::regclass)
     test_order              | boolean                  | DEFAULT false
    ...
    Indexes:
        "pk_order" PRIMARY KEY, btree (id)
        "idx_test_order" btree (test_order)
    
    
    
    # explain SELECT COUNT(*) FROM ord_order WHERE (test_order = false);
                                    QUERY PLAN
    --------------------------------------------------------------------------
     Aggregate  (cost=89898.79..89898.80 rows=1 width=0)
       ->  Index Only Scan using idx_test_order on ord_order  (cost=0.43..85375.37 rows=1809366 width=0)
             Index Cond: (test_order = false)
             Filter: (NOT test_order)
    (4 wiersze)
    
    
    
    # explain SELECT COUNT(id) FROM ord_order WHERE (test_order = false);
                                    QUERY PLAN
    --------------------------------------------------------------------------
     Aggregate  (cost=712924.52..712924.53 rows=1 width=4)
       ->  Seq Scan on ord_order  (cost=0.00..708401.10 rows=1809366 width=4)
             Filter: (NOT test_order)
    (3 wiersze)
    

    now the difference is ~90k vs ~713k and index scan vs. full scan

    解决方案

    I managed to supply custom Spring Data Repository base class implementation and factory using that implementation. As result generated count queries now have form of:

    SELECT COUNT(1) FROM table
    

    which has same plan as COUNT(* ). This seems to be fine solution and works globally for all defined repositories in application.

    I did not know how to generate COUNT(* ), COUNT(1) was much easier as COUNT function expects some expressions as parameters and I could supply static value - 1

    这篇关于eclipselink jpa使用COUNT(id)生成计数查询,而不是COUNT(*)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆