在大量数据上选择Count(*) [英] Select Count(*) over large amount of data

查看:245
本文介绍了在大量数据上选择Count(*)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想为报表执行此操作,但是我的表中有20,000,000条记录,这导致我的应用程序超时.

I want to do this for a Report but i have 20,000,000 of records in my table and it causes an TimeOut in my application.

SELECT
        T.transactionStatusID,
        TS.shortName AS TransactionStatusDefShortName,
        count(*) AS qtyTransactions
    FROM
        Transactions T 

    INNER JOIN TransactionTypesCurrencies TTC
                ON  T.id_Ent = TTC.id_Ent
                    AND T.trnTypeCurrencyID = TTC.trnTypeCurrencyID
            INNER JOIN TransactionStatusDef TS
                ON  T.id_Ent = TS.ent_Ent
                AND T.transactionStatusID = TS.ID
WHERE
    T.id_Ent = @id_Ent
GROUP BY
        T.transactionStatusID,
        TS.shortName

据我所知COUNT(*)导致全表扫描,这使我的查询花费了太多时间,即时通讯使用MS SQL 2005,有什么帮助吗?

as far as i know COUNT(*) causes a full table scan and it makes my query to take too much time, im Using MS SQL 2005, any help ?

项目负责人告诉我,查询只用了一天就可以提供帮助?

The project leader tells me that the Query is just for one day it could help?

推荐答案

据我所知COUNT(*)导致全表扫描,这使我的查询花费了太多时间,即时通讯使用MS SQL 2005,有什么帮助吗?

as far as i know COUNT(*) causes a full table scan and it makes my query to take too much time, im Using MS SQL 2005, any help ?

COUNT(*)可以使用能够给出答案的任何来源,其中包括索引.

COUNT(*) can use any source that is able to give the answer, this includes indexes.

在您的情况下,我将使用trnTypeCurrencyID(id_ent, transactionStatusID)上创建覆盖索引:

In your very case, I'd create a covering index on (id_ent, transactionStatusID) with trnTypeCurrencyID:

CREATE INDEX ON Transactions (id_ent, transactionStatusID) INCLUDE (trnTypeCurrencyID)

然后稍微重写一下查询:

and rewrite the query a little:

SELECT  transactionStatusID, qtyTransactions, TS.shortName
FROM    (
        SELECT  T.transactionStatusID,
                COUNT(*) AS qtyTransactions
        FROM    Transactions T
        JOIN    TransactionTypesCurrencies TTC
        ON      TTC.id_Ent = T.id_Ent
                AND TTC.trnTypeCurrencyID = T.trnTypeCurrencyID
        WHERE   T.id_Ent = @id_Ent
        GROUP BY
                T.transactionStatusID
        ) TD
JOIN    TransactionStatusDef TS
ON      TS.ent_Ent = @id_Ent
        AND TS.ID = TD.transactionStatusID

索引将在id_ent上过滤并在transactionStatusID上并行化.既然已经涵盖了trnTypeCurrencyID,引擎将不必在表中查找值,因为它已经存在于索引中.

The index will filter on id_ent and parallelize on transactionStatusID. Since you have trnTypeCurrencyID covered, the engine will not have to lookup the value in the table, it's already present in the index.

GROUP BY子句还仅包含索引中的列,因此可以更好地并行化.

The GROUP BY clause also includes only the columns from the index so it parallelizes much better.

更新:

通过添加WITH (ONLINE = ON),您可以使表在创建索引时保持可操作状态:

By adding WITH (ONLINE = ON) you can leave the table operational for the time the index is being created:

CREATE INDEX ON Transactions (id_ent, transactionStatusID) INCLUDE (trnTypeCurrencyID) WITH (ONLINE = ON)

这篇关于在大量数据上选择Count(*)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆