Does the number of columns affect query performance?


Question

CASE 1: I have a table with 30 columns, and I query it using 4 columns in the WHERE clause.

CASE 2: I have a table with 6 columns, and I query it using 4 columns in the WHERE clause.

How does performance differ between the two cases?

For example, I have these tables:

-- Table A: seven columns, two of which (g, h) are never referenced by the query
CREATE TABLE A
(
  b varchar(10),
  c varchar(10),
  d varchar(10),
  e varchar(10),
  f varchar(10),
  g varchar(10),
  h varchar(10)
);

SELECT b,c,d
FROM A
WHERE f='foo';

-- Table B: the same columns without g and h
CREATE TABLE B
(
  b varchar(10),
  c varchar(10),
  d varchar(10),
  e varchar(10),
  f varchar(10)
);

SELECT b,c,d
FROM B
WHERE f='foo';

Tables A and B have the same structure apart from the number of columns. The column used in the WHERE condition is the same, and the columns in the SELECT are the same. The only difference is that table A has some extra columns that are not used in either the SELECT or the WHERE condition. In that case, is there any difference in performance between the two queries?

Recommended answer

Does the total number of columns in a table impact performance (if the same subset of columns is selected, and there are no indexes on the table)?

Yes, marginally. With no indexes at all, both queries (Table A and Table B) will do table scans. Given that Table B has fewer columns than Table A, the rows per page (density) will be higher on B, so B will be marginally quicker because fewer pages need to be fetched.
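The page-count difference can be observed directly. The following is a minimal sketch, assuming SQL Server and assuming tables A and B from the question exist with the same number of rows and similar data; the "logical reads" figure reported for each statement is the number of pages touched.

```sql
-- Assumption: tables A and B from the question exist and hold comparable data.
SET STATISTICS IO ON;   -- report page reads per statement in the Messages output

SELECT b, c, d FROM A WHERE f = 'foo';   -- scan of the wider table
SELECT b, c, d FROM B WHERE f = 'foo';   -- scan of the narrower table

SET STATISTICS IO OFF;
```

With no indexes on either table, the scan of A would be expected to report somewhat more logical reads than the scan of B, since fewer of A's wider rows fit on each page.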

However, given that your queries are of the form:

SELECT b,c,d
FROM X
WHERE f='foo';

the performance of the query will be dominated by the indexing on column f, rather than by the number of columns in the underlying tables.

For the OP's exact queries, the fastest performance will result from the following indexing:

  • Index on A(f) INCLUDE (b,c,d)
  • Index on B(f) INCLUDE (b,c,d)
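In T-SQL, the two covering indexes listed above could be created roughly as follows. This is a sketch that assumes SQL Server and the tables from the question; the index names IX_A_f_covering and IX_B_f_covering are hypothetical and chosen only for illustration.

```sql
-- Hypothetical index names; f is the key column, b, c and d are stored
-- at the leaf level only, so the query can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_A_f_covering
    ON A (f)
    INCLUDE (b, c, d);

CREATE NONCLUSTERED INDEX IX_B_f_covering
    ON B (f)
    INCLUDE (b, c, d);
```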

Irrespective of the number of columns in Table A or Table B, with the above indexes in place, performance should be identical for both queries (assuming the same number of rows and similar data in both tables). SQL Server will hit the indexes, which now have similar column widths and row densities, without needing any additional data from the original table.

Does the number of columns in the SELECT affect query performance?

The main benefit of returning fewer columns in a SELECT is that SQL Server might be able to avoid reading from the table / cluster altogether, provided it can retrieve all the selected data from an index (either as indexed columns or as included columns, in the case of a covering index).

Obviously, the columns used in the predicate (the WHERE filter), i.e. f in your example, MUST be in the indexed columns of the index, and the data distribution must be sufficiently selective for an index to be used in the first place.

There is also a secondary benefit in returning fewer columns from a SELECT: it reduces I/O overhead, especially if there is a slow network between the database server and the app consuming the data. In other words, it is good practice to return only the columns you actually need and to avoid using SELECT *.

Edit

Some other plans:

  • Index on B(f) with no other key or INCLUDE columns, or with an incomplete set of INCLUDE columns (i.e. one or more of b, c or d are missing):

SQL Server will likely need to do a Key or RID Lookup: even if the index is used, it will need to "join" back to the table to retrieve the columns in the SELECT clause that are missing from the index. (The lookup type depends on whether the table has a clustered index, such as a clustered PK, or not.)
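A sketch of this non-covering case, assuming SQL Server and table B from the question, with a hypothetical index name IX_B_f:

```sql
-- Hypothetical index name; key column f only, no INCLUDE columns.
CREATE NONCLUSTERED INDEX IX_B_f
    ON B (f);

-- b, c and d are not stored in IX_B_f, so if the optimizer chooses this
-- index it has to go back to the base table for each matching row:
-- a RID Lookup when B is a heap, a Key Lookup when B has a clustered index.
SELECT b, c, d
FROM B
WHERE f = 'foo';
```

Whether the index is used at all depends on how selective f = 'foo' is; for a predicate matching many rows, the optimizer may well prefer a plain scan over many lookups.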

  • Straight non-clustered index on B(f,b,c,d)

This will still be very performant, as the index will be used and the table avoided, but it won't be quite as good as the covering index, because the additional key columns make the index tree less dense.
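For comparison, a sketch of this all-key-columns variant, again assuming SQL Server and using a hypothetical index name:

```sql
-- Hypothetical index name; all four columns are key columns, so b, c and d
-- are carried in the upper levels of the index tree as well as in the leaves.
CREATE NONCLUSTERED INDEX IX_B_f_b_c_d
    ON B (f, b, c, d);

-- The query can still be answered from the index alone, without touching B.
SELECT b, c, d
FROM B
WHERE f = 'foo';
```

The INCLUDE form keeps only f in the non-leaf levels, which is why it tends to produce a slightly denser tree for this particular query.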
