Is this query irreducibly complex?
Question
I have two MySQL database tables, described below. One table holds device information, and the other is a one-to-many log about each device.
CREATE TABLE `device` (
    `id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    `name` VARCHAR(255) NOT NULL,
    `active` INT NOT NULL DEFAULT 1,
    INDEX (`active`)
);

CREATE TABLE `log` (
    `id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    `device_id` INT NOT NULL,
    `message` VARCHAR(255) NOT NULL,
    `when` DATETIME NOT NULL,
    INDEX (`device_id`)
);
What I want to do is grab device information along with the latest log entry for each device in a single query (if possible). So far, what I have is the following:
SELECT d.id, d.name, l.message
FROM device AS d
LEFT JOIN (
    SELECT l1.device_id, l1.message
    FROM log AS l1
    LEFT JOIN log AS l2 ON (l1.device_id = l2.device_id AND l1.when < l2.when)
    WHERE l2.device_id IS NULL
) AS l ON (d.id = l.device_id)
WHERE d.active = 1
GROUP BY d.id
ORDER BY d.id ASC;
These queries are simplified reproductions of my actual setup, where my log table has over 100k rows (and there are actually several log tables I look at). The query does run, but very, very slowly (say, more than two minutes). I'm convinced that there is a more concise/elegant/"SQL" way to form this query to get the data I need, but I just haven't found it yet.
Is what I want to do even possible without the ugly sub-SELECT and self-JOIN? Can I get the job done with a different strategy? Or, is the very nature of the query something that is irreducibly complex?
Again, the application logic is such that I can "manually JOIN" the tables if this isn't going to work, but I feel like MySQL should be able to handle something like this without choking - but I'm admittedly green when it comes to this kind of complex set algebra.
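For comparison, the "manual JOIN" fallback mentioned above could look roughly like the following. This is only a sketch in Python: the function name `join_latest_logs` and the tuple layouts are hypothetical, and it assumes the application has already fetched the rows of `device` and `log` with two plain SELECTs.

```python
def join_latest_logs(devices, logs):
    """Combine device rows with each device's newest log message.

    devices: iterable of (id, name, active) tuples  (hypothetical layout)
    logs:    iterable of (device_id, message, when) tuples, where `when`
             is any comparable timestamp (datetime or ISO-8601 string)
    """
    # Pass 1: remember only the newest (when, message) pair per device.
    latest = {}
    for device_id, message, when in logs:
        current = latest.get(device_id)
        if current is None or when > current[0]:
            latest[device_id] = (when, message)

    # Pass 2: emit active devices in id order, with None when a device
    # has no log rows (mirrors the LEFT JOIN behaviour of the SQL query).
    result = []
    for device_id, name, active in sorted(devices):
        if active == 1:
            entry = latest.get(device_id)
            result.append((device_id, name, entry[1] if entry else None))
    return result


if __name__ == "__main__":
    devices = [(2, "printer", 1), (1, "router", 1), (3, "retired", 0)]
    logs = [
        (1, "booted", "2024-01-01 08:00:00"),
        (1, "link up", "2024-01-01 09:30:00"),
        (3, "shutdown", "2024-01-02 10:00:00"),
    ]
    print(join_latest_logs(devices, logs))
```

This costs one pass over the log rows in application memory, which is workable at 100k rows but moves the whole log table over the wire, so it is a fallback rather than a replacement for a well-indexed query.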
Edit: As this is a contrived example, I'd forgotten to add the index to device.active.
Recommended answer
Here's a slightly different approach to your query that avoids the self-join:
SELECT d.id, d.name, l.message
FROM device AS d
LEFT JOIN (
    SELECT l1.device_id, l1.message
    FROM log AS l1
    WHERE l1.when = (
        -- newest timestamp for the same device
        SELECT MAX(l2.when)
        FROM log AS l2
        WHERE l2.device_id = l1.device_id
    )
) AS l ON l.device_id = d.id
WHERE d.active = 1
ORDER BY d.id ASC;
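To check what this query returns, here is a small runnable sketch. It uses Python's built-in `sqlite3` module as a stand-in for MySQL (an assumption made purely so the example is self-contained), so the table definitions are adapted to SQLite syntax and the sample rows are invented. The `when` column is backtick-quoted because it is a keyword in both engines.

```python
import sqlite3

# The correlated-subquery form of the query: keep only the log rows whose
# timestamp equals the newest timestamp recorded for their device.
LATEST_LOG_SQL = """
SELECT d.id, d.name, l.message
FROM device AS d
LEFT JOIN (
    SELECT l1.device_id, l1.message
    FROM log AS l1
    WHERE l1.`when` = (
        SELECT MAX(l2.`when`)
        FROM log AS l2
        WHERE l2.device_id = l1.device_id
    )
) AS l ON l.device_id = d.id
WHERE d.active = 1
ORDER BY d.id ASC
"""


def build_sample_db():
    """Create an in-memory database with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE device (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            active INTEGER NOT NULL DEFAULT 1
        );
        CREATE TABLE log (
            id INTEGER PRIMARY KEY,
            device_id INTEGER NOT NULL,
            message TEXT NOT NULL,
            `when` TEXT NOT NULL
        );
    """)
    conn.executemany(
        "INSERT INTO device (id, name, active) VALUES (?, ?, ?)",
        [(1, "router", 1), (2, "printer", 1), (3, "retired", 0)],
    )
    conn.executemany(
        "INSERT INTO log (device_id, message, `when`) VALUES (?, ?, ?)",
        [
            (1, "booted", "2024-01-01 08:00:00"),
            (1, "link up", "2024-01-01 09:30:00"),
            (3, "shutdown", "2024-01-02 10:00:00"),
        ],
    )
    return conn


def latest_logs(conn):
    """Run the query and return (id, name, message) rows."""
    return conn.execute(LATEST_LOG_SQL).fetchall()


if __name__ == "__main__":
    print(latest_logs(build_sample_db()))
```

Note one property of this approach: if two log rows for the same device share the exact same latest timestamp, the device appears once per tied row, since no GROUP BY collapses them.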
Since 100k rows is not a very large table, I wouldn't expect this query to take more than a few seconds, even without the proper indexes. However, as the comments suggest, you might consider adding additional indexes based on the results of your explain plan.
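One index that is often worth trying for this query shape is a composite index on `log (device_id, when)`, since the inner `MAX(when)` lookup filters on `device_id`. That is a suggestion, not something the question or comments confirm. The sketch below again uses Python's `sqlite3` as a stand-in, and the index name `idx_log_device_when` is made up. In MySQL the rough equivalents would be ``ALTER TABLE log ADD INDEX (device_id, `when`)`` followed by `EXPLAIN` on the full query, and the plan output will look different there.

```python
import sqlite3


def build_indexed_log_table():
    """Create an empty log table plus a composite (device_id, when) index."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE log (
            id INTEGER PRIMARY KEY,
            device_id INTEGER NOT NULL,
            message TEXT NOT NULL,
            `when` TEXT NOT NULL
        );
        -- Hypothetical index name; column order matters: the equality
        -- column first, then the column being MAX()-ed.
        CREATE INDEX idx_log_device_when ON log (device_id, `when`);
    """)
    return conn


def plan_for_latest_timestamp(conn):
    """Return the human-readable plan lines for the per-device MAX lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT MAX(`when`) FROM log WHERE device_id = ?",
        (1,),
    ).fetchall()
    # The last column of each plan row holds the description text.
    return [row[-1] for row in rows]


if __name__ == "__main__":
    for line in plan_for_latest_timestamp(build_indexed_log_table()):
        print(line)
```

If the plan mentions the composite index, the per-device lookup is being answered from the index rather than by scanning the table. Whether that removes the bottleneck in the real multi-table setup still has to be confirmed against the actual explain plan.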