Tips and tricks to speed up SQL


Question


These are the basic SQL functions and keywords.

Are there any tips or tricks to speed up your SQL?

For example, I have a query with a lot of keywords (AND, GROUP BY, ORDER BY, IN, BETWEEN, LIKE, etc.).

Which keyword should come first in my query? How can I decide?

Example:

Where NUMBER IN (156, 646)
AND DATE BETWEEN '01/01/2011' AND '01/02/2011'

Where DATE BETWEEN '01/01/2011' AND '01/02/2011'
AND NUMBER IN (156, 646)

Which one is faster? What does it depend on?

Recommended answer

There are no tricks.

Given the competition between database vendors over which one is "faster", any "trick" that is always true would already be implemented in the database itself. (Such tricks are implemented in the part of the database called the "optimizer".)

There are only things to be aware of, but they typically can't be reduced to rules like:


  • Use feature X
  • Avoid feature Y
  • Model like this
  • Never model like that

Look at all the raging questions and discussions about indexes, index types, index strategies, clustering, single-column keys, compound keys, referential integrity, access paths, joins, join mechanisms, storage engines, optimizer behaviour, datatypes, normalization, query transformations, denormalization, procedures, buffer cache, resultset cache, application cache, modeling, aggregation, functions, views, indexed views, set processing, procedural processing, and the list goes on.

All of them were invented to attack a specific problem area. Variations on that problem make the "trick" more or less suitable. Very often the tricks have zero effect, and sometimes they are flat-out horrible. Why? Because when we don't understand why something works, we are basically just throwing features at the problem until it goes away.

The key point here is that there is a reason why something makes a query go faster, and understanding what that something is, is crucial to understanding why a different, unrelated query is slow and how to deal with it. And it is never a trick, nor magic.

We (humans) are lazy, and we want to be thrown that fish when what we really need is to learn how to catch it.

Now, what specific fish do YOU want to catch?

Edited for comments:
The placement of your predicates in the WHERE clause makes no difference, since the order in which they are processed is determined by the database. Some of the things that will affect that order (for your example) are:


  • Whether or not the query can be rewritten against an indexed view
  • Which available indexes cover one or both of the columns NUMBER and DATE, and in what order those columns appear in each index
  • The estimated selectivity of your predicates, which basically means the estimated percentage of rows matched by each predicate. The lower the percentage, the more likely the optimizer is to use your index efficiently.
  • The clustering factor (or whatever the name is in SQL Server), if SQL Server factors that into the query cost. This has to do with how well the order of the index entries aligns with the physical order of the table rows. Better alignment reduces the cost when a higher percentage of rows is fetched via that index.
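The point about predicate placement can be checked by asking a database for its plan under both orderings. The following is only a minimal sketch: it uses SQLite through Python's standard `sqlite3` module as a stand-in (the answer itself is about SQL Server, whose optimizer and plan output differ), and the table `orders`, the columns `order_number` and `order_date`, and the index name are hypothetical names chosen for illustration.

```python
import sqlite3


def plan_details(conn, sql):
    """Return the textual steps of the plan SQLite chooses for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The fourth column of each row holds the readable description of the step.
    return [row[3] for row in rows]


# Hypothetical schema: an in-memory table with one composite index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_number INTEGER, order_date TEXT)")
conn.execute(
    "CREATE INDEX idx_orders_number_date ON orders (order_number, order_date)"
)

# Same predicates, written in the two orders from the question.
number_first = plan_details(conn, """
    SELECT * FROM orders
    WHERE order_number IN (156, 646)
      AND order_date BETWEEN '2011-01-01' AND '2011-01-02'
""")
date_first = plan_details(conn, """
    SELECT * FROM orders
    WHERE order_date BETWEEN '2011-01-01' AND '2011-01-02'
      AND order_number IN (156, 646)
""")

print(number_first)
print(date_first)
print("same plan:", number_first == date_first)
```

In this sketch both orderings report the same plan steps, because the planner decides how to evaluate the predicates from the available index rather than from the order in which they were typed.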

Now, if the only values you have in column NUMBER are 156 and 646 and they are pretty much evenly spread, an index would be useless. A full scan would be a better alternative.
On the other hand, if those are unique order numbers (backed by a unique index), the optimizer will pick that index and drive the query from there. Similarly, if the rows having a DATE between the first and second of January 2011 make up a small enough percentage of the rows, an index leading with DATE will be considered.
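To make the two extremes concrete, the sketch below measures the actual fraction of rows each predicate matches. It is an illustration under stated assumptions, not how any optimizer works internally: a real optimizer estimates this fraction from statistics instead of counting rows, and the table `orders`, the columns `order_id` and `status_code`, and the helper `selectivity` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status_code INTEGER)"
)
# 1000 rows: order_id is unique (1..1000), while status_code only ever
# holds 156 or 646, spread evenly across the table.
conn.executemany(
    "INSERT INTO orders (order_id, status_code) VALUES (?, ?)",
    [(i, 156 if i % 2 else 646) for i in range(1, 1001)],
)


def selectivity(conn, predicate):
    """Fraction of rows matching a hard-coded predicate (never user input)."""
    total = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    matched = conn.execute(
        f"SELECT COUNT(*) FROM orders WHERE {predicate}"
    ).fetchone()[0]
    return matched / total


# Every row matches: an index on this column cannot narrow anything down.
evenly_spread = selectivity(conn, "status_code IN (156, 646)")
# Two rows out of 1000 match: an index on this column pays off.
unique_numbers = selectivity(conn, "order_id IN (156, 646)")

print(evenly_spread, unique_numbers)
```

The same `IN (156, 646)` predicate is useless as a filter in the first case and highly selective in the second, which is why the data, and not the keyword, decides whether an index helps.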

Or, if you include ORDER BY NUMBER, DATE, another parameter comes into the equation: the cost of sorting. An index on (NUMBER, DATE) will now seem more attractive to the optimizer, because even though it might not be the most efficient way of acquiring the rows, the sorting (which is expensive) can be skipped.
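The effect of a matching index on the sort step can be observed in a plan. This is again a rough sketch using SQLite via Python's `sqlite3` module as a stand-in for SQL Server; the names `orders`, `order_number`, `order_date` and `idx_orders_number_date` are hypothetical, and the check relies on SQLite reporting a "TEMP B-TREE FOR ORDER BY" step whenever it has to sort.

```python
import sqlite3


def plan_details(conn, sql):
    """Return the textual steps of the plan SQLite chooses for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


def needs_sort(plan):
    """True if the plan contains an explicit sorting step for ORDER BY."""
    return any("TEMP B-TREE FOR ORDER BY" in step for step in plan)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_number INTEGER, order_date TEXT)")

query = (
    "SELECT order_number, order_date FROM orders "
    "ORDER BY order_number, order_date"
)

# Without an index, the rows have to be sorted after they are read.
plan_without_index = plan_details(conn, query)

# With an index in the same column order, rows can be read already sorted.
conn.execute(
    "CREATE INDEX idx_orders_number_date ON orders (order_number, order_date)"
)
plan_with_index = plan_details(conn, query)

print(plan_without_index, needs_sort(plan_without_index))
print(plan_with_index, needs_sort(plan_with_index))
```

Once the index exists, the sorting step disappears from the plan, which is the saving the optimizer weighs against the cost of reading rows through that index.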

Or, if your query included a join to another table (say customer) on customer_id and you also had a filter on customer.ssn, the equation changes again, because (since you did a good job with foreign keys and a backing index) you will now have a very efficient access path into your first table without using the indexes on NUMBER or DATE. Unless you only have one customer and all of the 10 million orders were his...
