Clear explanation of the "theta join" in relational algebra?


Question

I'm looking for a clear, basic explanation of the concept of theta join in relational algebra and perhaps an example (using SQL perhaps) to illustrate its usage.

If I understand it correctly, the theta join is a natural join with a condition added in. So, whereas the natural join enforces equality between attributes of the same name (and removes the duplicate?), the theta join does the same thing but adds in a condition. Do I have this right? Any clear explanation, in simple terms (for a non-mathematician), would be greatly appreciated.

Also (sorry to just throw this in at the end, but it's sort of related), could someone explain the importance or idea of the Cartesian product? I think I'm missing something with regard to the basic concept, because to me it just seems like a restating of a basic fact, i.e. that a set of 13 X a set of 4 = 52...

Recommended answer

Leaving SQL aside for a moment...

A relational operator takes one or more relations as parameters and yields a relation. Because a relation has no attributes with duplicate names by definition, the relational operations theta join and natural join will both "remove the duplicate attributes." [A big problem with posting examples in SQL to explain relational operations, as you requested, is that the result of a SQL query is not a relation because, among other sins, it can have duplicate rows and/or columns.]

The relational Cartesian product operation (results in a relation) differs from set Cartesian product (results in a set of pairs). The word 'Cartesian' isn't particularly helpful here. In fact, Codd called his primitive operator 'product'.
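
To make the difference concrete, here is a minimal Python sketch. It is only an illustration: the modelling of a relation as a set of frozensets of (attribute, value) pairs, the helper names `rel` and `rel_product`, and the sample values are all assumptions made for this example, not anything defined by Codd or Tutorial D.

```python
from itertools import product

# Set Cartesian product: the elements are ordered pairs,
# so swapping the operands gives a different set.
numbers = {1, 2, 3}
letters = {"a", "b"}
pairs_nl = set(product(numbers, letters))  # pairs such as (1, "a")
pairs_ln = set(product(letters, numbers))  # pairs such as ("a", 1)


def rel(*rows):
    """Hypothetical helper: model a relation as a set of tuples,
    each tuple being a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}


def rel_product(r, s):
    """Relational product; assumes r and s share no attribute names."""
    return {t | u for t in r for u in s}


r1 = rel({"Y": 1}, {"Y": 2}, {"Y": 3})
r2 = rel({"X": "a"}, {"X": "b"})

# The set product depends on operand order; the relational product does not,
# because values are identified by attribute name rather than by position.
set_product_commutes = pairs_nl == pairs_ln
rel_product_commutes = rel_product(r1, r2) == rel_product(r2, r1)

print(set_product_commutes)  # False
print(rel_product_commutes)  # True
```

Both products contain 6 elements here, but only the relational one produces something that is again a relation with named attributes.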

The truly relational language Tutorial D lacks a product operator, and product is not a primitive operator in the relational algebra proposed by Hugh Darwen**, co-author of Tutorial D. This is because the natural join of two relations with no attribute names in common results in the same relation as the product of the same two relations, i.e. natural join is more general and therefore more useful.

Consider these examples (Tutorial D):

WITH RELATION { TUPLE { Y 1 } , TUPLE { Y 2 } , TUPLE { Y 3 } } AS R1 ,
     RELATION { TUPLE { X 1 } , TUPLE { X 2 } } AS R2 :
R1 JOIN R2

returns the product of the relations, i.e. a relation of degree two (two attributes, X and Y) and cardinality 6 (2 x 3 = 6 tuples).

However,

WITH RELATION { TUPLE { Y 1 } , TUPLE { Y 2 } , TUPLE { Y 3 } } AS R1 ,
     RELATION { TUPLE { Y 1 } , TUPLE { Y 2 } } AS R2 :
R1 JOIN R2

returns the natural join of the relations, i.e. a relation of degree one (the set union of the attributes yields the single attribute Y) and cardinality 2 (only the tuples whose Y values match in both relations survive, each appearing once).
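
The two Tutorial D results above can be reproduced with a small Python sketch. This is not how Tutorial D is implemented; the representation of a relation as a set of frozensets of (attribute, value) pairs and the helper names `rel` and `natural_join` are assumptions chosen for illustration.

```python
def rel(*rows):
    """Hypothetical helper: a relation is a set of tuples,
    each tuple a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}


def natural_join(r, s):
    """Combine every pair of tuples that agree on all shared attribute names."""
    out = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            shared = dt.keys() & du.keys()
            if all(dt[a] == du[a] for a in shared):
                # Union of the two tuples; shared attributes appear once.
                out.add(t | u)
    return out


r1 = rel({"Y": 1}, {"Y": 2}, {"Y": 3})

# No attribute names in common: the join degenerates to the product.
no_common = natural_join(r1, rel({"X": 1}, {"X": 2}))

# Same attribute name: only tuples with matching Y values survive.
same_heading = natural_join(r1, rel({"Y": 1}, {"Y": 2}))

print(len(no_common))     # 6
print(len(same_heading))  # 2
```

With no shared attribute names the matching condition is vacuously true for every pair of tuples, which is why the join yields the full product.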

I hope the above examples explain why your statement "that a set of 13 X a set of 4 = 52" is not strictly correct.

Similarly, Tutorial D does not include a theta join operator. This is essentially because other operators (e.g. natural join and restriction) make it both unnecessary and not terribly useful. In contrast, Codd's primitive operators included product and restriction which can be used to perform a theta join.
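
As a rough sketch of that last point, a theta join can be written as a product followed by a restriction whose condition compares one attribute from each operand using some comparison operator (the "theta"). The Python below is illustrative only: the set-of-frozensets modelling and the helper names `rel`, `product`, `restrict` and `theta_join` are assumptions for this example, not operators taken from Codd's papers or from Tutorial D.

```python
import operator


def rel(*rows):
    """Hypothetical helper: a relation is a set of tuples,
    each tuple a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}


def product(r, s):
    """Relational product; assumes r and s share no attribute names."""
    return {t | u for t in r for u in s}


def restrict(r, predicate):
    """Keep only the tuples for which the predicate holds."""
    return {t for t in r if predicate(dict(t))}


def theta_join(r, s, left, theta, right):
    """Product followed by the restriction 'left theta right'."""
    return restrict(product(r, s), lambda t: theta(t[left], t[right]))


r1 = rel({"Y": 1}, {"Y": 2}, {"Y": 3})
r2 = rel({"X": 1}, {"X": 2})

greater = theta_join(r1, r2, "Y", operator.gt, "X")  # Y > X
equal = theta_join(r1, r2, "Y", operator.eq, "X")    # Y = X (an equijoin)

print(len(greater))  # 3
print(len(equal))    # 2
```

Note that, in this formulation, the theta join starts from the product rather than from the natural join: the compared attributes have different names and both remain in the result.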

SQL has an explicit product operator named CROSS JOIN which forces the result to be the product even if it entails violating 1NF by creating duplicate columns (attributes). Consider the SQL equivalent to the latter Tutorial D example above:

WITH R1 AS (SELECT * FROM (VALUES (1), (2), (3)) AS T (Y)), 
     R2 AS (SELECT * FROM (VALUES (1), (2)) AS T (Y))
SELECT * 
  FROM R1 CROSS JOIN R2;

This returns a table expression with two columns (rather than one attribute), both called Y (!!), and 6 rows, i.e. the same result as this:

SELECT c1 AS Y, c2 AS Y 
  FROM (VALUES (1, 1), 
               (2, 1), 
               (3, 1), 
               (1, 2), 
               (2, 2), 
               (3, 2)
       ) AS T (c1, c2);

** That is, although there is only one relational model (i.e. Codd's), there can be more than one relational algebra (i.e. Codd's is but one).
