SQL Performance UNION vs OR

Question

I just read part of an optimization article and segfaulted on the following statement:

When using SQL, replace statements using OR with a UNION:

select username from users where company = 'bbc' or company = 'itv';

to:

select username from users where company = 'bbc' union
select username from users where company = 'itv';

From a quick EXPLAIN:

Using OR:

Using UNION:

Doesn't this mean UNION does double the work?

While I appreciate UNION may be more performant for certain RDBMSes and certain table schemas, this is not categorically true as the author suggests.

Am I wrong?

Answer

Either the article you read used a bad example, or you misinterpreted their point.

select username from users where company = 'bbc' or company = 'itv';

This is equivalent to:

select username from users where company IN ('bbc', 'itv');

MySQL can use an index on company for this query just fine. There's no need to do any UNION.
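
As a rough way to check this yourself, the sketch below assumes the users table from the question and adds an index on company. The index name idx_users_company is made up for illustration.

```sql
-- Hypothetical index name; assumes a users table with a company column.
create index idx_users_company on users (company);

-- Both predicates touch only the company column, so one index can serve either form.
explain select username from users where company = 'bbc' or company = 'itv';
explain select username from users where company in ('bbc', 'itv');
```

If the optimizer chooses the index, its name shows up in the key column of the EXPLAIN output for both statements.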

The trickier case is where you have an OR condition that involves two different columns.

select username from users where company = 'bbc' or city = 'London';

Suppose there's an index on company and a separate index on city. Given that MySQL usually uses only one index per table in a given query, which index should it use? If it uses the index on company, it would still have to do a table-scan to find rows where city is London. If it uses the index on city, it would have to do a table-scan for rows where company is bbc.

The UNION solution is for this type of case.

select username from users where company = 'bbc' 
union
select username from users where city = 'London';

Now each sub-query can use the index for its own search, and the results of the sub-queries are combined by the UNION.
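
To compare the two forms on your own data, something like the following should work. It assumes the users table from the question with a separate single-column index on each column; both index names are hypothetical. Depending on the MySQL version and the data, the optimizer may also combine the two indexes for the OR form on its own (the index merge optimization), so it is worth reading the plan rather than assuming.

```sql
-- Hypothetical index names; assumes users has company and city columns.
-- Skip a statement if an equivalent index already exists.
create index idx_users_company on users (company);
create index idx_users_city on users (city);

-- OR across two different columns: check which index, if any, the plan uses.
explain select username from users where company = 'bbc' or city = 'London';

-- UNION form: each branch filters on one column and can use that column's index.
explain
select username from users where company = 'bbc'
union
select username from users where city = 'London';
```

The key and rows columns of each plan give a first impression of how much of the table each form has to read.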

An anonymous user proposed an edit to my answer above, but a moderator rejected the edit. It should have been a comment, not an edit. The proposed edit claimed that UNION has to sort the result set to eliminate duplicate rows, that this makes the query run slower, and that the index optimization is therefore a wash.

My response is that the indexes help to reduce the result set to a small number of rows before the UNION happens. UNION does in fact eliminate duplicates, but to do that it only has to sort the small result set. There might be cases where the WHERE clauses match a significant portion of the table, and sorting during UNION is as expensive as simply doing the table-scan. But it's more common for the result set to be reduced by the indexed searches, so the sorting is much less costly than the table-scan.
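
If the duplicate elimination does turn out to be the bottleneck, a common variant is UNION ALL with the second branch excluding rows the first branch already returned. This is only a sketch, assuming the goal is the same rows as the OR query on the users table from the question. The is null check is there because company <> 'bbc' alone would drop rows where company is NULL.

```sql
-- UNION ALL skips the duplicate-elimination pass that plain UNION performs.
select username from users where company = 'bbc'
union all
-- Exclude rows already returned by the first branch; keep rows with a NULL company.
select username from users
where city = 'London' and (company <> 'bbc' or company is null);
```

Unlike plain UNION, this keeps repeated usernames that come from different rows, which matches what the OR query returns. Whether it is actually faster still depends on the data.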

The difference depends on the data in the table, and the terms being searched. The only way to determine the best solution for a given query is to try both methods in the MySQL query profiler and compare their performance.
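
A minimal sketch of that comparison, assuming the users table from the question and a MySQL session where the profiling variable is still available (it is deprecated in newer releases, so treat this as one option rather than the only one):

```sql
-- Turn on per-session statement profiling.
set profiling = 1;

-- Method 1: OR across two columns.
select username from users where company = 'bbc' or city = 'London';

-- Method 2: UNION of two single-column searches.
select username from users where company = 'bbc'
union
select username from users where city = 'London';

-- Lists the profiled statements with their durations.
show profiles;
```

Running each query several times and comparing the Duration values gives a fairer picture than a single run, since the first run may pay for reading data from disk.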
