SQL join: selecting the last records in a one-to-many relationship
Question
Suppose I have a table of customers and a table of purchases. Each purchase belongs to one customer. I want to get a list of all customers along with their last purchase in one SELECT statement. What is the best practice? Any advice on building indexes?
Please use these table/column names in your answer:
- customer: id, name
- purchase: id, customer_id, item_id, date
And in more complicated situations, would it be (performance-wise) beneficial to denormalize the database by putting the last purchase into the customer table?
If the (purchase) id is guaranteed to be sorted by date, can the statements be simplified by using something like LIMIT 1?
Recommended answer
This is an example of the greatest-n-per-group problem that has appeared regularly on StackOverflow.
Here's how I usually recommend solving it:
SELECT c.*, p1.*
FROM customer c
JOIN purchase p1 ON (c.id = p1.customer_id)
LEFT OUTER JOIN purchase p2 ON (c.id = p2.customer_id AND
(p1.date < p2.date OR (p1.date = p2.date AND p1.id < p2.id)))
WHERE p2.id IS NULL;
Explanation: given a row p1, there should be no row p2 with the same customer and a later date (or in the case of ties, a later id). When we find that to be true, then p1 is the most recent purchase for that customer.
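To see the exclusion join at work, here is a minimal sketch that runs the query against an in-memory SQLite database from Python. The choice of SQLite, the sample rows, and the helper name latest_purchases are assumptions made for illustration; only the table and column names come from the question. The query selects explicit columns rather than c.*, p1.* to keep the output readable.

```python
import sqlite3

# Same exclusion-join logic as the answer's query, with explicit columns.
LATEST_PURCHASE_SQL = """
SELECT c.id, c.name, p1.id, p1.item_id, p1.date
FROM customer c
JOIN purchase p1 ON (c.id = p1.customer_id)
LEFT OUTER JOIN purchase p2 ON (c.id = p2.customer_id AND
    (p1.date < p2.date OR (p1.date = p2.date AND p1.id < p2.id)))
WHERE p2.id IS NULL
ORDER BY c.id;
"""


def latest_purchases():
    """Build hypothetical sample data and return each customer's last purchase."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE purchase (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(id),
            item_id INTEGER,
            date TEXT  -- ISO-8601 strings sort correctly as text
        );
        INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
        INSERT INTO purchase VALUES
            (10, 1, 100, '2024-01-05'),
            (11, 1, 101, '2024-03-01'),
            (12, 2, 102, '2024-02-10'),
            (13, 2, 103, '2024-02-10');  -- same date as 12, so id breaks the tie
    """)
    rows = conn.execute(LATEST_PURCHASE_SQL).fetchall()
    conn.close()
    return rows


rows = latest_purchases()
for row in rows:
    print(row)
```

In this sample, purchase 11 wins for Alice because no later-dated row exists, and purchase 13 wins for Bob because the date tie falls back to the higher id. Carol has no purchases, so the inner join between customer and purchase leaves her out of the result.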
Regarding indexes, I'd create a compound index in purchase over the columns (customer_id, date, id). That may allow the outer join to be done using a covering index. Be sure to test on your platform, because optimization is implementation-dependent. Use the features of your RDBMS to analyze the optimization plan, e.g. EXPLAIN on MySQL.
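As a rough sketch of that workflow, the snippet below creates the compound index and asks the database for its plan. It assumes SQLite driven from Python, where the counterpart of MySQL's EXPLAIN is EXPLAIN QUERY PLAN; the index name purchase_customer_date_id is made up for illustration, and the plan text differs between engines and versions, so treat the output as something to inspect rather than rely on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        item_id INTEGER,
        date TEXT
    );
    -- Compound index over (customer_id, date, id); the name is hypothetical.
    CREATE INDEX purchase_customer_date_id
        ON purchase (customer_id, date, id);
""")

query = """
SELECT c.id, c.name, p1.id, p1.item_id, p1.date
FROM customer c
JOIN purchase p1 ON (c.id = p1.customer_id)
LEFT OUTER JOIN purchase p2 ON (c.id = p2.customer_id AND
    (p1.date < p2.date OR (p1.date = p2.date AND p1.id < p2.id)))
WHERE p2.id IS NULL
"""

# Each plan row ends with a human-readable description of one step.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
plan_details = [row[-1] for row in plan]
for detail in plan_details:
    print(detail)

# Confirm the index was actually created.
index_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
print(index_names)
```

Whether the planner picks the index depends on the engine and on table statistics, so it is worth repeating the check against realistic data volumes.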
Some people use subqueries instead of the solution I show above, but I find my solution makes it easier to resolve ties.
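For comparison, here is one common subquery formulation, sketched against SQLite from Python. It is not necessarily the form the author had in mind: it uses a correlated subquery with ORDER BY and LIMIT 1, and resolves ties by adding id as a second sort key. The sample rows are hypothetical, and LIMIT itself is not available in every RDBMS.

```python
import sqlite3

# Correlated subquery: for each customer, pick the id of the newest purchase.
SUBQUERY_SQL = """
SELECT c.id, c.name, p.id, p.item_id, p.date
FROM customer c
JOIN purchase p ON p.customer_id = c.id
WHERE p.id = (
    SELECT p2.id
    FROM purchase p2
    WHERE p2.customer_id = c.id
    ORDER BY p2.date DESC, p2.id DESC  -- id breaks ties on equal dates
    LIMIT 1
)
ORDER BY c.id;
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        item_id INTEGER,
        date TEXT
    );
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO purchase VALUES
        (10, 1, 100, '2024-01-05'),
        (11, 1, 101, '2024-03-01'),
        (12, 2, 102, '2024-02-10'),
        (13, 2, 103, '2024-02-10');
""")

subquery_rows = conn.execute(SUBQUERY_SQL).fetchall()
for row in subquery_rows:
    print(row)
conn.close()
```

Without the second sort key, the row chosen for Bob would depend on however the engine happens to order the two purchases made on the same date.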