SQL join following foreign key: statically check that LHS is key-preserved

Problem description

    Often you join two tables following their foreign key, so that the row in the RHS table will always be found. Adding the join does not change the number of rows returned by the query. For example:

    create table a (x int not null primary key)
    create table b (x int not null primary key, y int not null)
    alter table a add foreign key (x) references b (x)
    

    Now, assuming you set up some data in these two tables, you can get a certain number of rows from a:

    select x from a
    

    Adding a join to b following the foreign key does not change this:

    select a.x from a join b on a.x = b.x
    

    However, that is not true of joins in general, which may filter out some rows or (by Cartesian product) add more:

    select a.x from a join b on a.x = b.x and b.y != 42 -- probably gives fewer rows
    
    select a.x from a join b on a.x != b.y -- probably gives more rows
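The row counts described above can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, the sample rows are invented for illustration, and the foreign key is declared inline because SQLite does not support `alter table ... add foreign key`.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    create table b (x int not null primary key, y int not null);
    create table a (x int not null primary key references b (x));
    -- Illustrative data, not from the original question.
    insert into b values (1, 10), (2, 42), (3, 30);
    insert into a values (1), (2), (3);
""")

def count(sql):
    """Number of rows returned by the given select statement."""
    return conn.execute(f"select count(*) from ({sql})").fetchone()[0]

base = count("select x from a")
fk_join = count("select a.x from a join b on a.x = b.x")
filtered = count("select a.x from a join b on a.x = b.x and b.y != 42")
multiplied = count("select a.x from a join b on a.x != b.y")

print(base, fk_join, filtered, multiplied)  # prints: 3 3 2 9
```

With this data, only the plain join on the foreign key keeps the row count of `a`; the extra predicate drops a row and the inequality join multiplies rows.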
    

    When reading SQL code there is no obvious way to tell whether a join is the key-preserving kind, which may add extra columns but does not change the number of rows returned, or whether it has other effects. Over time I have developed a coding convention which I mostly stick to:

    • if a key-preserving join, use join
    • if wanting to filter rows, put the filter condition in the where clause
    • if wanting more rows, sometimes cross join for Cartesian product is the clearest way

    These are usually just style issues, since you can often put a predicate into either the join clause or the where clause, for example.

    My question

    Is there some way to have these key-preserving joins statically checked by the database server when the query is compiled? I understand that the query optimizer already knows that a join on a foreign key will always find exactly one row in the table pointed to by the foreign key. But I would like to tag it in my SQL code for the benefit of human readers. For example, suppose the new syntax fkjoin is used for a join following a foreign key. Then the following SQL fragments will give errors or not:

    a fkjoin b on a.x = b.x -- OK
    
    a fkjoin b on a.x = b.x and b.y = 42 -- "Error, join can fail due to extra predicate"
    
    a fkjoin b on a.x = b.y -- "Error, no foreign key from a.x to b.y"
    

    This would be a useful check for me when writing the SQL, and also when returning to read it later. I understand and accept that changing the foreign keys in the database would change what SQL is legal under this scheme - to me, that is a desired outcome, since if a necessary FK ceases to exist then the key-preserving semantics of the query are no longer guaranteed, and I'd like to find out about it.

    Potentially, there could be some external SQL static checker tool that does the work, and special comment syntax could be used rather than a new keyword. The checker tool would need access to the database schema to see what foreign keys exist, but it would not need to actually execute the query.
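A very rough sketch of what such an external checker might look like follows. Everything here is hypothetical: the function name `check_fk_join`, the restriction to ON clauses made of `table.column = table.column` terms joined by AND, and the use of SQLite's `PRAGMA foreign_key_list` in place of SQL Server's catalog views are all assumptions made so that the example runs on its own. It reads only the schema and never executes the query being checked.

```python
import re
import sqlite3

# Matches a single "t1.c1 = t2.c2" equality term and nothing else.
EQUALITY = re.compile(r"^\s*(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)\s*$")

def check_fk_join(conn, lhs, rhs, on_clause):
    """Return None if the ON clause is exactly a declared FK from lhs to rhs,
    otherwise an error string. Hypothetical helper, not a real tool."""
    pairs = set()
    for term in re.split(r"\s+and\s+", on_clause, flags=re.IGNORECASE):
        m = EQUALITY.match(term)
        if not m:
            return f"Error, join can fail due to extra predicate: {term.strip()}"
        t1, c1, t2, c2 = m.groups()
        if (t1, t2) == (lhs, rhs):
            pairs.add((c1, c2))
        elif (t1, t2) == (rhs, lhs):
            pairs.add((c2, c1))
        else:
            return f"Error, join can fail due to extra predicate: {term.strip()}"
    # Collect the column pairs of each foreign key declared on lhs that points at rhs.
    fks = {}
    rows = conn.execute(f"PRAGMA foreign_key_list({lhs})")
    for fk_id, _seq, ref_table, from_col, to_col, *_rest in rows:
        if ref_table == rhs:
            fks.setdefault(fk_id, set()).add((from_col, to_col))
    if pairs in fks.values():
        return None
    return f"Error, no foreign key from {lhs} to {rhs} on these columns"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table b (x int not null primary key, y int not null);
    create table a (x int not null primary key references b (x));
""")

ok = check_fk_join(conn, "a", "b", "a.x = b.x")
extra = check_fk_join(conn, "a", "b", "a.x = b.x and b.y = 42")
no_fk = check_fk_join(conn, "a", "b", "a.x = b.y")
print(ok, extra, no_fk, sep="\n")
```

The sketch deliberately ignores a great deal: it does not parse real SQL, it compares names case-sensitively, and it does not verify that the referencing columns are NOT NULL, which a key-preserving inner join also requires.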

    Is there something that does what I want? I am using MSSQL 2008 R2. (Microsoft SQL Server for the pedantic)

    Solution

    I realize that you are interested in indicating whether a particular join on particular columns is on a FK, or is a restriction, or is some other case, or is none of these. (It's not clear what you mean by "success" or "failure" of a join, or why that is relevant.) But focusing on that information, as explained below, misses more important and fundamental things.

    A base table has a "meaning" or "predicate (expression)" that is a fill-in-the-(named-)blanks statement given by the DBA. The names of the blanks of the statement are the columns of the table. Rows that fill in the blanks to make a true proposition about the world go in the table. Rows that fill in the blanks to make a false proposition about the world are left out. I.e., a table holds the rows that satisfy its statement. You cannot set a base table to a certain value without knowing its statement, observing the world and putting the appropriate rows into the table. You cannot know about the world from a base table except by knowing its statement and taking present-row propositions to be true and absent-row propositions to be false. I.e., you need its statement to use the database.

    Notice that the typical syntax for a table declaration looks like a shorthand for its statement:

    -- employee [eid] is named [name] and lives at [address] in ...
    EMPLOYEE(eid,name,address,...)
    

    You can make bigger statements by putting logical operators AND, OR, AND NOT, EXISTS name, AND condition, etc. between or around other statements. If you translate a statement to a relation/SQL expression by converting

    • a table's statement to its name
    • AND to JOIN
    • OR to UNION
    • AND NOT to EXCEPT/MINUS
    • EXISTS C,... [...] to SELECT all columns but C,... FROM ...
    • AND condition to ON/WHERE condition
    • IMPLIES to SUBSETOF
    • IFF to =

    then you get a relation expression that calculates the rows that make the statement true. (Arguments of UNION and EXCEPT/MINUS need the same columns.) So just as every table holds the rows satisfying its statement, a query expression holds the rows that satisfy its statement. You cannot know about the world from a query result except by knowing its statement and taking its present-row propositions to be true and absent-row propositions to be false. I.e., you need its statement to compose or interpret a query. (Observe that this is true regardless of what constraints hold.)

    This is the foundation of the relational model: table expressions calculate rows satisfying corresponding statements. (To the extent that SQL differs, it is literally illogical.)

    E.g., if table T holds the rows that make statement T(...,T.Ci,...) true and table U holds the rows that make statement U(...,U.Cj,...) true, then table T JOIN U holds the rows that make statement T(...,T.Ci,...) AND U(...,U.Cj,...) true. That is the semantics of JOIN that is important to using a database. You can always join, and a join always has a meaning, and it is always the AND of its operands' meanings. Whether any tables happen to have FKs to others just isn't particularly helpful for reasoning about updates or queries. (The DBMS uses constraints for when you make mistakes.)
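The claim that a join holds exactly the rows satisfying the AND of its operands' statements can be illustrated over a tiny finite domain. The tables, their statements and their rows below are invented for illustration; plain Python sets stand in for relations.

```python
from itertools import product

# T: "employee [eid] lives in [city]"   (illustrative rows)
T = {(1, "Oslo"), (2, "Rome")}
# U: "[city] is in [country]"           (illustrative rows)
U = {("Oslo", "Norway"), ("Rome", "Italy"), ("Pisa", "Italy")}

# Natural join of T and U on the shared column city.
joined = {
    (eid, city, country)
    for (eid, city) in T
    for (u_city, country) in U
    if city == u_city
}

# All candidate rows over the values that occur, kept only when
# T(eid, city) AND U(city, country) is true.
eids = {eid for eid, _ in T}
cities = {city for _, city in T} | {city for city, _ in U}
countries = {country for _, country in U}
satisfying = {
    (eid, city, country)
    for eid, city, country in product(eids, cities, countries)
    if (eid, city) in T and (city, country) in U
}

print(joined == satisfying)  # prints: True
print(sorted(joined))        # prints: [(1, 'Oslo', 'Norway'), (2, 'Rome', 'Italy')]
```

Each joined row reads as "employee [eid] lives in [city] AND [city] is in [country]", and no foreign key is needed to interpret it.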

    A constraint expression just corresponds to a proposition (i.e., an always-true statement) about the world and simultaneously to one about base tables. E.g., for C UNIQUE NOT NULL in U, the following three expressions are equivalent to each other:

    • FOREIGN KEY T (C) REFERENCES U (C)
    • EXISTS columns other than C T(...,C,...)
      IMPLIES EXISTS columns other than C U(...,C,...)
    • (SELECT C FROM T) SUBSETOF (SELECT C FROM U)

    It is true that this implies that SELECT T.C FROM T JOIN U ON T.C = U.C returns the same rows as SELECT C FROM T, i.e., a join on a FK returns the same number of rows as the referencing table. But so what? The join's meaning is still the same function of its arguments' meanings.
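The subset form of the constraint and its effect on the join can be checked on sample data. This is a sketch using Python's sqlite3 module as a stand-in; the extra columns D and E and all rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table U (C int not null unique, D text);
    create table T (C int not null references U (C), E text);
    -- Illustrative rows: every T.C value also appears in U.C.
    insert into U values (1, 'u1'), (2, 'u2'), (3, 'u3');
    insert into T values (1, 't1'), (1, 't2'), (3, 't3');
""")

t_c = [row[0] for row in conn.execute("select C from T order by C")]
u_c = {row[0] for row in conn.execute("select C from U")}
join_c = [
    row[0]
    for row in conn.execute("select T.C from T join U on T.C = U.C order by T.C")
]

# (SELECT C FROM T) SUBSETOF (SELECT C FROM U)
subset_holds = set(t_c) <= u_c
# The join yields one row per row of T, duplicates included.
join_preserves_t = join_c == t_c

print(subset_holds, join_preserves_t)  # prints: True True
```

Because U.C is unique and T.C is NOT NULL, each row of T matches exactly one row of U, which is why the join reproduces T's C values one for one.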

    Whether a particular join on a particular column set involves a foreign key is just not germane to understanding the meaning of a query.
