如何编码深度嵌套的 JOIN? [英] How to code deeply nested JOIN?

查看:41
本文介绍了如何编码深度嵌套的 JOIN?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有五个表,我们称它们为 A 到 E,每个表都有一个唯一键和一个外键,外键是 parent 表的唯一键.

因此:A->B->C->D->E

A 有字段 A_keyB_key;B 有键 B_keyC_key,依此类推...

给定A_key,我如何SELECT * from E?这让我很难过:-(

我需要嵌套的 JOIN 还是什么?

为什么系统不让我提交问题而不在最后添加这个?为什么说它包含不正确的语法?

<小时>

[更新]

我刚刚意识到我用错了措辞.

它是 A->B->C->D<-E - 一切都指向向下",直到最后一个指向向上".

我不知道如何更改数据库.有什么办法可以调整@GiorgosBetsos 的回答吗?

根据评论中的要求,这里是表格.给定一个campaign_id,我想SELECT * FROM taps

mysql>描述活动;+--------------+------------+------+-----+---------+----------------+|领域 |类型 |空 |钥匙 |默认 |额外 |+--------------+------------+------+-----+---------+----------------+|广告系列_id |整数(11) |否 |PRI |空 |自动增量||标题 |文字 |否 ||空 |||说明 |文字 |是 ||空 |||path_to_logo |文字 |是 ||空 |||开始时间 |文字 |否 ||空 |||结束时间 |文字 |否 ||空 |||暂停 |tinyint(1) |否 ||0 ||+--------------+------------+------+-----+---------+----------------+7 行(0.00 秒)mysql>描述目的地;+----------------+---------------+------+-----+---------+------+|领域 |类型 |空 |钥匙 |默认 |额外 |+----------------+---------------+------+-----+---------+------+|目的地_id |整数(11) |否 |PRI |空 |自动增量||广告系列_id |整数(11) |否 |PRI |空 |||网址 |varchar(2048) |否 ||空 |||说明 |varchar(2048) |否 ||空 ||+----------------+---------------+------+-----+---------+------+4 行(0.00 秒)mysql>描述联系;+----------------+---------+------+-----+---------+----------------+|领域 |类型 |空 |钥匙 |默认 |额外 |+----------------+---------+------+-----+---------+----------------+|connection_id |整数(11) |否 |多|空 |自动增量||目的地_id |整数(11) |否 ||空 |||说明 |文字 |否 ||空 |||tag_id |整数(11) |是 ||空 |||国家 |文字 |是 ||空 |||县 |文字 |是 ||空 |||镇 |文字 |是 ||空 |||邮政编码 |文字 |是 ||空 |||定制 |文字 |是 ||空 ||+----------------+---------+------+-----+---------+----------------+9 行(0.00 秒)mysql>描述标签;+--------------+---------+------+-------+---------+-------+|领域 |类型 |空 |钥匙 |默认 |额外 |+--------------+---------+------+-------+---------+-------+|tag_id |整数(11) |否 |PRI |空 |||标签类型 |整数(11) |否 ||空 |||customer_id |整数(11) |否 |PRI |空 |||connection_id |整数(11) |否 |PRI |空 ||+--------------+---------+------+-------+---------+-------+4 行(0.00 秒)mysql>描述水龙头;+------------+------------+------+-----+----------------+-------+|领域 |类型 |空 |钥匙 |默认 |额外 |+------------+------------+------+-----+----------------+-------+|tag_id |整数(11) |否 ||空 |||时间戳 |时间戳 |否 ||CURRENT_TIMESTAMP |||device_id |文字 |是 ||空 |||设备类型 |文字 |是 ||空 ||+------------+------------+------+-----+----------------+-------+4 行(0.00 秒)

解决方案

以下查询将让您选择 taps 表中的所有列:

SELECTtaps.tag_id,taps.time_stamp,taps.device_id,taps.device_typeFROM 活动 AS 阵营INNER JOIN 目的地 AS destON dest.campaign_id = camp.campaign_idINNER JOIN 连接 AS connON conn.destination_id = dest.destination_id内连接标签ON tags.connection_id = conn.connection_idINNER JOIN 水龙头ON taps.tag_id = tags.tag_id哪里camp.campaign_id = :campaign_id

你可以用 taps.* 替换 SELECT 之后的 4 行,但是使用 * 是不好的做法,就像你曾经添加过一样以后的任何列,您将开始得到不同的结果.

此外,这个查询应该比 Giorgos 提供的查询更有效,因为 EXISTS 通常不会像 JOIN 那样优化.>

(我发誓我不会跟踪你并试图找出你的答案,Giorgos,我碰巧遇到了和你一样的两个问题,并认为我可以提供更好的解决方案.)

从您描述您遇到的问题的方式来看,关于这是一个深度嵌套的 JOIN"表明您正在错误地查看数据库关系,因为许多人(包括我自己)在对数据库还相对缺乏经验时首先会这样做.

集合思维简介

数据库将采用上面的 INNER JOIN 列表,并决定从一个开始.在这种情况下,它将从 campaigns 表开始,因为它知道它可以从该表中获得最小的结果(一行,或者没有,如果没有匹配 campaign_id>.)

从那里开始,它将建立那一行,并且对于destinations表中的每条记录,它会找到所有具有匹配行ON<代码>campaign_id 列.从那里,它会将 destination_id 添加到其结果中,对应 destinations 表中匹配的每一行.

然后,对于其结果中的每一行(假设有 3 个目的地匹配),它会找到与它在结果集中具有相同 destination_id 的所有连接的匹配项.然后将 connection_id 添加到其结果中,对于每个给定的 destination_id,连接表中的每一行一个.假设 3 个目的地分别有 0、1 和 4 个连接.这意味着数据库现在保存了 5 行的结果设置(没有任何 connectionsdestination_id 从结果中删除,因为我们使用了 INNER JOINs.)

该过程沿线重复,直到最终到达 taps 表,并将该表中的所有信息添加到结果中.最后,它只从它持有的信息中抓取你想要的列,并从 taps 表中返回行.

但重要的是,您没有告诉它工作的顺序.从数据库的角度来看,以下查询是等效的:

SELECTtaps.tag_id,taps.time_stamp,taps.device_id,taps.device_type从水龙头内连接标签ON taps.tag_id = tags.tag_idINNER JOIN 连接 AS connON tags.connection_id = conn.connection_idINNER JOIN 目的地 AS destON conn.destination_id = dest.destination_idINNER JOIN 活动 AS 阵营ON dest.campaign_id = camp.campaign_id哪里camp.campaign_id = :campaign_id

请注意,虽然从我们的角度来看顺序是相反的,但数据库仍将按照与上述相同的顺序运行.这是因为数据库希望从最小的一组行开始,然后一次将一个表添加到结果中.

您不必担心顺序或任何嵌套.只需担心通过 ON 语句告诉数据库如何比较表.您甚至可以按您想要的几乎任何顺序放置表格(在本例中),只要任何 ON 语句仅引用已定义的表格(从我们的阅读角度,从上到下.)

我的第二个查询实际上可能更容易理解集合概念.从您想要的信息(taps 表)作为 FROM 子句开始.然后开始JOINing 表,使您更接近WHERE 子句中的信息,直到您为数据库引擎提供了足够的信息来将事情拼凑起来并获取您需要的信息.

希望这有助于您开始思考正确的道路(就像有人为我所做的那样,大约 5 年前).如果您有任何问题,请随时提出,我会用这些来改进我的解释.

I have five tables, let's call them A through E, each of which has a unique key and a foreign key which is the unique key of the parent table.

Thus: A->B->C->D->E

A has fields A_key and B_key; B has keys B_key and C_key, and so on ...

Given A_key, how do I SELECT * from E? This has me stumped :-(

Do I need nested JOINs, or what?

And why won't the system let me submit the question without adding this at the end? Why does it say that it contains improper grammar?


[Update]

I have just realized that I phrased it wrongly.

It is A->B->C->D<-E - everything points "downwards", until the last, which points "upwards".

I don't see how I can change the database. Is there any way to tweak @GiorgosBetsos answer?

As requested in comment, here are the tables. Given a campaign_id, I want to SELECT * FROM taps

mysql> describe campaigns;
+--------------+------------+------+-----+---------+----------------+
| Field        | Type       | Null | Key | Default | Extra          |
+--------------+------------+------+-----+---------+----------------+
| campaign_id  | int(11)    | NO   | PRI | NULL    | auto_increment |
| title        | text       | NO   |     | NULL    |                |
| description  | text       | YES  |     | NULL    |                |
| path_to_logo | text       | YES  |     | NULL    |                |
| start_time   | text       | NO   |     | NULL    |                |
| end_time     | text       | NO   |     | NULL    |                |
| paused       | tinyint(1) | NO   |     | 0       |                |
+--------------+------------+------+-----+---------+----------------+
7 rows in set (0.00 sec)

mysql> describe destinations;
+----------------+---------------+------+-----+---------+----------------+
| Field          | Type          | Null | Key | Default | Extra          |
+----------------+---------------+------+-----+---------+----------------+
| destination_id | int(11)       | NO   | PRI | NULL    | auto_increment |
| campaign_id    | int(11)       | NO   | PRI | NULL    |                |
| url            | varchar(2048) | NO   |     | NULL    |                |
| description    | varchar(2048) | NO   |     | NULL    |                |
+----------------+---------------+------+-----+---------+----------------+
4 rows in set (0.00 sec)

mysql> describe connections;
+----------------+---------+------+-----+---------+----------------+
| Field          | Type    | Null | Key | Default | Extra          |
+----------------+---------+------+-----+---------+----------------+
| connection_id  | int(11) | NO   | MUL | NULL    | auto_increment |
| destination_id | int(11) | NO   |     | NULL    |                |
| description    | text    | NO   |     | NULL    |                |
| tag_id         | int(11) | YES  |     | NULL    |                |
| country        | text    | YES  |     | NULL    |                |
| county         | text    | YES  |     | NULL    |                |
| town           | text    | YES  |     | NULL    |                |
| post_code      | text    | YES  |     | NULL    |                |
| custom         | text    | YES  |     | NULL    |                |
+----------------+---------+------+-----+---------+----------------+
9 rows in set (0.00 sec)

mysql> describe tags;
+---------------+---------+------+-----+---------+-------+
| Field         | Type    | Null | Key | Default | Extra |
+---------------+---------+------+-----+---------+-------+
| tag_id        | int(11) | NO   | PRI | NULL    |       |
| tag_type      | int(11) | NO   |     | NULL    |       |
| customer_id   | int(11) | NO   | PRI | NULL    |       |
| connection_id | int(11) | NO   | PRI | NULL    |       |
+---------------+---------+------+-----+---------+-------+
4 rows in set (0.00 sec)

mysql> describe taps;
+-------------+-----------+------+-----+-------------------+-------+
| Field       | Type      | Null | Key | Default           | Extra |
+-------------+-----------+------+-----+-------------------+-------+
| tag_id      | int(11)   | NO   |     | NULL              |       |
| time_stamp  | timestamp | NO   |     | CURRENT_TIMESTAMP |       |
| device_id   | text      | YES  |     | NULL              |       |
| device_type | text      | YES  |     | NULL              |       |
+-------------+-----------+------+-----+-------------------+-------+
4 rows in set (0.00 sec)

解决方案

The following query will let you select all the columns from the taps table:

SELECT
taps.tag_id,
taps.time_stamp,
taps.device_id,
taps.device_type
FROM campaigns AS camp
INNER JOIN destinations AS dest
ON dest.campaign_id = camp.campaign_id
INNER JOIN connections AS conn
ON conn.destination_id = dest.destination_id
INNER JOIN tags 
ON tags.connection_id = conn.connection_id
INNER JOIN taps
ON taps.tag_id = tags.tag_id
WHERE camp.campaign_id = :campaign_id

You CAN replace the 4 rows after the SELECT with taps.*, but using * is bad practice, as if you ever add any columns in the future, you will start getting back different results.

Also, this query should be more efficient than the one provided by Giorgos, as EXISTS will, in general, not optimize as well as a JOIN will.

(I swear I'm not following you around and trying to one up your answers, Giorgos, I just happened to run across 2 of the same questions as you and thought I could provide a better solution.)

From how you describe the problem you are having, about this being a "deeply nested JOIN" suggests that you are viewing database relations incorrectly, as many, including myself, first do when still relatively inexperienced with databases.

Introduction to thinking with sets

The database is going to take the list of INNER JOINs above, and decide on one to start with. In this case, it will start with the campaigns table, because it knows it can get the smallest result back from that table (one row, or none, if none match the campaign_id.)

From there, it will build off of that one row, and for each record in the destinations table, it will find all those with a matching row ON the campaign_id column. From there, it will add the destination_id to its results, one for each row in the destinations table that was matched.

Then, for each row in its results (let's say there were 3 destinations match), it finds the matches all the connections with the same destination_id as it has in its result set. It then adds the connection_id to its results, one for each row in the connections table, for each given destination_id. Let's say the 3 destinations had 0, 1, and 4 connections, each. This means the database is now holding a result set up 5 rows (the destination_id without any connections was dropped from the result, because we are using INNER JOINs.)

The process repeats down the line, until it finally reaches the taps table, and all the information from that table is added to the result. Finally, it grabs just the columns you want from the information it is holding on to, and returns the rows from the taps table.

Importantly though, you don't have tell it the order to work in. The following query is equivalent, from the database's point of view:

SELECT
taps.tag_id,
taps.time_stamp,
taps.device_id,
taps.device_type
FROM taps
INNER JOIN tags
ON taps.tag_id = tags.tag_id
INNER JOIN connections AS conn
ON tags.connection_id = conn.connection_id
INNER JOIN destinations AS dest
ON conn.destination_id = dest.destination_id
INNER JOIN campaigns AS camp
ON dest.campaign_id = camp.campaign_id
WHERE camp.campaign_id = :campaign_id

Note, that while the order is reversed from our perspective, the database will still operate in the same order as described above. That is because the database is looking to start with the smallest set of rows, and then add one table at a time to the result.

You do NOT have to worry about the order, or any nesting. Worry only about telling the database how to compare the tables, via the ON statements. You can even put the tables (in this example) in pretty much any order you want, so long as any ON statements only refer to already defined tables (from our reading perspective, top to bottom.)

The second query of mine may actually be easier to understand the sets concept. Start with the information you want (the taps table) as the FROM clause. Then start JOINing tables that get you closer to the information you have in the WHERE clause, until you have provided enough information for the database engine to piece things together and grab the information you need.

Hopefully that is helpful to get you starting thinking down the right path (as someone did for me, around 5 years ago). If you have any questions, feel free to ask, and I'll use those to improve my explanation.

这篇关于如何编码深度嵌套的 JOIN?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆