Can scalar functions be applied before filtering when executing a SQL statement?

Question

I suppose I have always naively assumed that scalar functions in the select part of a SQL query will only get applied to the rows that meet all the criteria of the where clause.

Today I was debugging some code from a vendor and had that assumption challenged. The only reason I can think of for this code failing is that the Substring() function is being called on data that should have been filtered out by the WHERE clause. It appears that the substring call is applied before the filtering happens, so the query fails.

Here is an example of what I mean. Let's say we have two tables, each with 2 columns, holding 2 rows and 1 row respectively. The first column in each is just an ID. NAME is just a string, and NAME_LENGTH tells us how many characters are in the name with the same ID. Note that only names with more than one character have a corresponding row in the LONG_NAMES table.

NAMES: ID, NAME
    1, "Peter"
    2, "X"
LONG_NAMES: ID, NAME_LENGTH
    1, 5

If I want a query to print each name with the last 3 letters cut off, I might first try something like this (assuming SQL Server syntax for now):

SELECT substring(NAME,1,len(NAME)-3)
    FROM NAMES;

I would soon find out that this gives me an error: when it reaches "X", it tries to use a negative number for the length in the substring call, and the call fails.

The way my vendor decided to solve this was by filtering out rows where the strings were too short for the len - 3 query to work. He did it by joining to another table:

SELECT substring(NAMES.NAME,1,len(NAMES.NAME)-3) 
    FROM NAMES 
        INNER JOIN LONG_NAMES 
            ON NAMES.ID = LONG_NAMES.ID;

At first glance, this query looks like it might work. The join condition will eliminate any rows that have NAME fields short enough for the substring call to fail.

However, from what I can observe, SQL Server will sometimes try to calculate the substring expression for everything in the table, and only then apply the join to filter out rows. Is it supposed to happen this way? Is there a documented order of operations where I can find out when certain things will happen? Is it specific to a particular database engine, or part of the SQL standard? If I decided to include some predicate on my NAMES table to filter out short names (like len(NAME) > 3), could SQL Server also choose to apply that after trying to apply the substring? If so, then it seems the only safe way to do a substring would be to wrap it in a "case when" construct in the select?
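For illustration, the "case when" wrapping mentioned above might look like the sketch below. It reuses the NAMES and LONG_NAMES tables from the example; the column types and the TRIMMED_NAME alias are assumptions added here, not part of the vendor's code. The idea is that the length check sits in the same expression as the substring call, so it no longer matters when the join filter is applied relative to the select list.

```sql
-- Setup matching the example tables (column types are assumed).
CREATE TABLE NAMES (ID int, NAME varchar(50));
CREATE TABLE LONG_NAMES (ID int, NAME_LENGTH int);

INSERT INTO NAMES (ID, NAME) VALUES (1, 'Peter'), (2, 'X');
INSERT INTO LONG_NAMES (ID, NAME_LENGTH) VALUES (1, 5);

-- Guard the substring call inside the expression itself, so a negative
-- length is never passed even if this expression is computed for a row
-- that the join later discards.
SELECT CASE
           WHEN len(NAMES.NAME) > 3
               THEN substring(NAMES.NAME, 1, len(NAMES.NAME) - 3)
           ELSE NULL  -- name too short to trim
       END AS TRIMMED_NAME  -- alias is illustrative
    FROM NAMES
        INNER JOIN LONG_NAMES
            ON NAMES.ID = LONG_NAMES.ID;
```

With the sample data, the row for "X" yields NULL instead of raising an error if it is ever evaluated. This relies on CASE evaluating its WHEN conditions in order for plain scalar expressions like this one. Whether that holds in every situation (for example, with aggregates inside the CASE) should be checked against the SQL Server documentation.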

Recommended answer

Martin gave this link that pretty much explains what is going on: the query optimizer has free rein to reorder things however it likes. I am including this as an answer so I can accept something. Martin, if you create an answer with your link in it, I will gladly accept that instead of this one.

I do want to leave my question here because I think it is a tricky one to search for, and my particular phrasing of the issue may be easier for someone else to find in the future.

TSQL divide by zero encountered despite no columns containing 0

As more responses have come in, I am again confused. It does not seem clear yet when exactly the optimizer is allowed to evaluate things in the select clause. I guess I'll have to go find the SQL standard myself and see if I can make sense of it.
