Very slow subqueries when using "NOT IN"

Question

I'm generating reports from a large pre-existing Access database (about 500 MB after compact and repair), and I'm having trouble with a slow subquery.

The database has a big table that contains a record of every customer purchase. Here's a simple query that finds customers who have bought a blue widget. It completes within a few seconds and returns about ten thousand records.

SELECT DISTINCT CustomerId 
FROM ProductSales
WHERE Product = 'BLUE' 

Here's a query that tries to find customers who have bought a blue widget but not a red widget. It takes about an hour to run.

SELECT DISTINCT CustomerId FROM ProductSales
WHERE Product = 'BLUE' 
AND CustomerId NOT IN (
    SELECT CustomerId 
    FROM ProductSales 
    WHERE Product = 'RED'
)

Is there a way to refactor the second query so that it takes a few minutes instead of an hour?

Recommended answer

The Access database engine can't use an index for Not In, so that query is bound to be slow. The query below instead left-joins the blue purchases to the red purchases on CustomerId and keeps only the rows with no match. With an index on CustomerId, it should be much faster because the db engine can use the index for the join.

SELECT DISTINCT blue.CustomerId
FROM
    ProductSales AS blue
    LEFT JOIN
        (
            SELECT CustomerId 
            FROM ProductSales 
            WHERE Product = 'RED'
        ) AS red
    ON blue.CustomerId = red.CustomerId
WHERE
        blue.Product = 'BLUE'
    AND red.CustomerId Is Null; 
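
Before swapping one query for the other in a report, it may be worth confirming on a small sample that both forms return the same customers. The following is only a minimal sketch of such a check. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for Access, a made-up handful of rows, and a hypothetical index name idx_sales_customer. It says nothing about how Access will plan or time either query.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the Access file.
# The rows below are invented sample data, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ProductSales (CustomerId INTEGER, Product TEXT)")
conn.executemany(
    "INSERT INTO ProductSales VALUES (?, ?)",
    [(1, "BLUE"), (1, "GREEN"), (2, "BLUE"), (2, "RED"),
     (3, "BLUE"), (3, "BLUE"), (4, "RED")],
)
# Index on the join column; the index name is hypothetical.
conn.execute("CREATE INDEX idx_sales_customer ON ProductSales (CustomerId)")

# Original form from the question.
NOT_IN = """
SELECT DISTINCT CustomerId FROM ProductSales
WHERE Product = 'BLUE'
AND CustomerId NOT IN (
    SELECT CustomerId FROM ProductSales WHERE Product = 'RED'
)
"""

# Rewritten form from the answer: left join, then keep unmatched rows.
LEFT_JOIN = """
SELECT DISTINCT blue.CustomerId
FROM ProductSales AS blue
LEFT JOIN (
    SELECT CustomerId FROM ProductSales WHERE Product = 'RED'
) AS red
ON blue.CustomerId = red.CustomerId
WHERE blue.Product = 'BLUE' AND red.CustomerId IS NULL
"""


def customer_ids(sql):
    """Run a query and return its CustomerId column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


not_in_ids = customer_ids(NOT_IN)
left_join_ids = customer_ids(LEFT_JOIN)
print(not_in_ids, left_join_ids)
```

On this sample, customers 1 and 3 bought blue but never red, so both lists should contain exactly those two ids.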

You could probably also try a Not Exists approach, but index use there is not guaranteed. Also, see David Fenton's comment on the original answer, which discusses the performance impact in more detail.
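
For reference, a Not Exists version might look like the sketch below. It is not from the original answer. It again assumes SQLite through Python's sqlite3 module rather than Access, uses invented sample rows, and makes no claim about index use or speed. The sample deliberately includes one RED row with a Null CustomerId, because in standard SQL (and in SQLite) a Null inside the Not In list makes the comparison unknown for every non-matching customer, while Not Exists is unaffected. Whether this matters depends on whether CustomerId can be Null in your data.

```python
import sqlite3

# In-memory SQLite stand-in; the rows are invented sample data.
# One RED row has a NULL CustomerId to show how the two forms differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ProductSales (CustomerId INTEGER, Product TEXT)")
conn.executemany(
    "INSERT INTO ProductSales VALUES (?, ?)",
    [(1, "BLUE"), (2, "BLUE"), (2, "RED"), (3, "BLUE"), (None, "RED")],
)

# Correlated NOT EXISTS: keep a blue row only if the same customer
# has no red row.
NOT_EXISTS = """
SELECT DISTINCT blue.CustomerId
FROM ProductSales AS blue
WHERE blue.Product = 'BLUE'
AND NOT EXISTS (
    SELECT 1 FROM ProductSales AS red
    WHERE red.CustomerId = blue.CustomerId
    AND red.Product = 'RED'
)
"""

# Original NOT IN form from the question, for comparison.
NOT_IN = """
SELECT DISTINCT CustomerId FROM ProductSales
WHERE Product = 'BLUE'
AND CustomerId NOT IN (
    SELECT CustomerId FROM ProductSales WHERE Product = 'RED'
)
"""

not_exists_ids = sorted(row[0] for row in conn.execute(NOT_EXISTS))
not_in_ids = sorted(row[0] for row in conn.execute(NOT_IN))
print(not_exists_ids, not_in_ids)
```

Here Not Exists returns customers 1 and 3, while Not In returns no rows because of the Null in its subquery. Adding "AND CustomerId IS NOT NULL" to the Not In subquery would be one way to make the two forms agree.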
