Measuring the complexity of SQL statements

Question

The complexity of methods in most programming languages can be measured in cyclomatic complexity with static source code analyzers. Is there a similar metric for measuring the complexity of a SQL query?

It is simple enough to measure the time it takes a query to return, but what if I just want to be able to quantify how complicated a query is?

[Edit / Note]
While getting the execution plan is useful, that is not necessarily what I am trying to identify in this case. I am not looking for how difficult it is for the server to execute the query; I am looking for a metric that identifies how difficult it was for the developer to write the query, and how likely it is to contain a defect.


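Admittedly, there are times when measuring complexity is not useful, but there are also times when it is. For further discussion on that topic, see this question.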

Recommended answer

Common measures of software complexity include Cyclomatic Complexity (a measure of how complicated the control flow is) and Halstead complexity (a measure of how complex the arithmetic is).

The "control flow" in a SQL query is best related to the "and" and "or" operators in the query.
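As a rough illustration of that idea, here is a minimal Python sketch that treats each AND / OR as a decision point and reports decisions + 1, by analogy with cyclomatic complexity. The function names and the choice to count only AND and OR are my assumptions, not an established SQL metric, and the keyword scan is deliberately crude (for example, it also counts the AND in BETWEEN ... AND ...).

```python
import re

# Assumption: only AND / OR are treated as decision points, as suggested above.
DECISION_KEYWORDS = {"AND", "OR"}


def strip_literals_and_comments(sql):
    """Crudely blank out string literals and comments so that keywords
    appearing inside them are not counted."""
    sql = re.sub(r"'(?:[^']|'')*'", "''", sql)         # single-quoted strings
    sql = re.sub(r"--[^\n]*", " ", sql)                 # line comments
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.S)    # block comments
    return sql


def sql_cyclomatic(sql):
    """Return (number of AND/OR keywords) + 1 for a single SQL statement."""
    words = re.findall(r"[A-Za-z_]+", strip_literals_and_comments(sql).upper())
    decisions = sum(1 for word in words if word in DECISION_KEYWORDS)
    return decisions + 1


query = (
    "SELECT id FROM orders "
    "WHERE status = 'open and shut' AND (total > 100 OR priority = 1)"
)
print(sql_cyclomatic(query))  # the 'and' inside the string literal is ignored
```

A real implementation would want a proper SQL parser rather than regular expressions, but even this crude count is enough to compare queries against one another.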

The "computational complexity" is best related to operators such as SUM or implicit JOINS.

Once you've decided how to categorize each unit of syntax of a SQL query as to whether it is "control flow" or "computation", you can straightforwardly compute Cyclomatic or Halstead measures.
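To make the Halstead side concrete, the sketch below classifies tokens as operators (a hand-picked keyword list plus punctuation) or operands (identifiers and literals) and then applies the standard Halstead formulas: volume V = N * log2(n) and difficulty D = (n1 / 2) * (N2 / n2). The tokenizer, the `OPERATOR_KEYWORDS` set, and the function name are all assumptions made for illustration; the categorization is exactly the part you would have to decide for yourself.

```python
import math
import re
from collections import Counter

# Assumption: a hand-picked, incomplete list of SQL keywords counted as operators.
OPERATOR_KEYWORDS = {
    "SELECT", "FROM", "WHERE", "JOIN", "ON", "AND", "OR", "NOT", "IN",
    "GROUP", "ORDER", "BY", "HAVING", "AS", "SUM", "COUNT", "AVG", "MIN", "MAX",
}

TOKEN_RE = re.compile(r"""
    '(?:[^']|'')*'            # string literal
  | \d+(?:\.\d+)?             # number
  | [A-Za-z_][A-Za-z_0-9]*    # keyword or identifier
  | <>|<=|>=|!=|\|\|          # multi-character operators
  | [^\s\w]                   # any other single symbol
""", re.VERBOSE)


def halstead(sql):
    """Compute basic Halstead measures for one SQL statement."""
    operators, operands = Counter(), Counter()
    for tok in TOKEN_RE.findall(sql):
        is_word_or_literal = tok[0].isalnum() or tok[0] in "_'"
        if tok.upper() in OPERATOR_KEYWORDS or not is_word_or_literal:
            operators[tok.upper()] += 1
        else:
            operands[tok] += 1
    n1, n2 = len(operators), len(operands)                    # distinct counts
    N1, N2 = sum(operators.values()), sum(operands.values())  # total counts
    vocabulary, length = n1 + n2, N1 + N2
    volume = length * math.log2(vocabulary) if vocabulary > 1 else 0.0
    difficulty = (n1 / 2) * (N2 / n2) if n2 else 0.0
    return {
        "distinct_operators": n1,
        "distinct_operands": n2,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
    }


metrics = halstead("SELECT SUM(total) FROM orders WHERE region = 'EU'")
print(metrics)
```

Moving a keyword between the operator and operand categories changes the absolute numbers, which is fine as long as the same rules are applied to every query being compared.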

What the SQL optimizer does to queries is, I think, absolutely irrelevant. The purpose of complexity measures is to characterize how hard it is for a person to understand the query, not how efficiently it can be evaluated.

Similarly, what the DDL says or whether views are involved shouldn't be included in such complexity measures. The assumption behind these metrics is that the complexity of the machinery inside a used abstraction isn't interesting when you simply invoke it, because presumably that abstraction does something well understood by the coder. This is why Halstead and Cyclomatic measures don't include called subroutines in their counting, and I think you can make a good case that views and DDL information are those "invoked" abstractions.

Finally, how perfectly right or how perfectly wrong these complexity numbers are doesn't matter much, as long as they reflect some truth about complexity and you can compare them relative to one another. That way you can sort all your SQL fragments by score, see which are the most complex, and focus your testing attention on those.
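The ranking step could look something like the following sketch. The `complexity_score` function is only a placeholder (1 plus the number of AND / OR / JOIN keywords, which is my assumption); any consistent metric could be substituted, since only the relative order matters here. The query names and texts are made up for the example.

```python
import re


def complexity_score(sql):
    """Placeholder score: 1 + number of AND / OR / JOIN keywords (an assumption)."""
    words = re.findall(r"[A-Za-z_]+", sql.upper())
    return 1 + sum(1 for word in words if word in {"AND", "OR", "JOIN"})


def rank_by_complexity(queries):
    """Return (name, score) pairs sorted from most to least complex."""
    scored = [(name, complexity_score(sql)) for name, sql in queries.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical queries, named so the ranking is easy to read.
queries = {
    "lookup": "SELECT name FROM users WHERE id = 42",
    "report": (
        "SELECT u.name, SUM(o.total) FROM users u "
        "JOIN orders o ON o.user_id = u.id "
        "WHERE o.total > 10 AND (u.active = 1 OR u.vip = 1) "
        "GROUP BY u.name"
    ),
    "filter": "SELECT id FROM orders WHERE total > 10 AND status = 1",
}

ranking = rank_by_complexity(queries)
for name, score in ranking:
    print(f"{score:3d}  {name}")
```

The queries at the top of the list are the ones to review and test first.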
