Complex SUM from multiple tables


Problem description

Here are my tables:

CREATE TABLE component
    (id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT UNIQUE);

CREATE TABLE file
    (id INTEGER PRIMARY KEY AUTOINCREMENT,
    component_id INTEGER,
    name TEXT UNIQUE);

CREATE TABLE function
    (id INTEGER PRIMARY KEY AUTOINCREMENT,
    file_id INTEGER,
    name TEXT,
    FOREIGN KEY(file_id) REFERENCES file(id),
    UNIQUE(file_id, name));

CREATE TABLE version
    (id INTEGER PRIMARY KEY AUTOINCREMENT,
    version TEXT UNIQUE);

CREATE TABLE data
    (id INTEGER PRIMARY KEY AUTOINCREMENT,
    file_id INTEGER,
    version_id INTEGER,
    function_id INTEGER,
    errors INTEGER,
    ...,
    FOREIGN KEY(file_id) REFERENCES file(id),
    FOREIGN KEY(version_id) REFERENCES version(id),
    FOREIGN KEY(function_id) REFERENCES function(id),
    UNIQUE(file_id, version_id, function_id));

I need two queries:

  • One to SUM the data.errors for all data in a file. For a given file id I need the total sum of all errors.
  • One to SUM the data.errors for all functions for all files inside a specific component.
  • ALL of the data.errors MUST belong to the most recent version_id.

Example of the version MAX requirement above:

DATA
id  file_id     version_id  function_id     errors
1       1           3           1           40
2       1           3           2           231
3       1           2           3           19

Here I need it to return ids 1 and 2 and disregard id 3, even though version 2 is the most recent version for that specific function. Row 3 does not match the most recent version (3) among the rows belonging to that file. Imagine a real-world scenario where a function is removed from a file in a new version.

The only requirement is that the query be as fast as possible. I would like to change the database constraints as little as possible (preferably not at all). If this can be done in the Django ORM, where I intend to use it, that would be great, but it is not required.

Solution

The most recent version of a file can be computed like this:

SELECT MAX(version_id)
FROM data
WHERE file_id = ?

This can simply be plugged into another query to get the sum:

SELECT SUM(errors)
FROM data
WHERE file_id = ?
  AND version_id = (SELECT MAX(version_id)
                    FROM data
                    WHERE file_id = ?)
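
To check the two queries above against the example rows from the question, here is a minimal sketch using Python's standard sqlite3 module. It assumes a trimmed-down data table (the elided columns and the foreign keys are left out), and the helper names make_example_db, latest_version and file_error_sum are hypothetical, introduced only for this illustration.

```python
import sqlite3

# Rows from the question's example:
# (id, file_id, version_id, function_id, errors)
EXAMPLE_ROWS = [
    (1, 1, 3, 1, 40),
    (2, 1, 3, 2, 231),
    (3, 1, 2, 3, 19),
]

# The file-level query from the answer; both placeholders take the same file id.
FILE_SUM_SQL = """
SELECT SUM(errors)
FROM data
WHERE file_id = ?
  AND version_id = (SELECT MAX(version_id)
                    FROM data
                    WHERE file_id = ?)
"""


def make_example_db():
    """Create an in-memory database holding only a trimmed data table (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE data
           (id INTEGER PRIMARY KEY AUTOINCREMENT,
            file_id INTEGER,
            version_id INTEGER,
            function_id INTEGER,
            errors INTEGER,
            UNIQUE(file_id, version_id, function_id))"""
    )
    conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?)", EXAMPLE_ROWS)
    return conn


def latest_version(conn, file_id):
    """Return the highest version_id recorded for the file, or None if it has no rows."""
    row = conn.execute(
        "SELECT MAX(version_id) FROM data WHERE file_id = ?", (file_id,)
    ).fetchone()
    return row[0]


def file_error_sum(conn, file_id):
    """Sum errors over the file's rows that belong to its latest version."""
    return conn.execute(FILE_SUM_SQL, (file_id, file_id)).fetchone()[0]


conn = make_example_db()
print(latest_version(conn, 1))  # 3
print(file_error_sum(conn, 1))  # 271, i.e. 40 + 231; row 3 (version 2) is ignored
```

Because SUM over zero rows yields NULL in SQLite, file_error_sum returns None for a file id that has no data rows; wrapping the sum in COALESCE(SUM(errors), 0) would be one way to get 0 instead, if that is preferred.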

To extend this to a component, another subquery is needed to look up the component's files:

SELECT SUM(errors)
FROM data
WHERE file_id IN (SELECT id
                  FROM file
                  WHERE component_id = ?)
  AND version_id = (SELECT MAX(version_id)
                    FROM data
                    WHERE file_id IN (SELECT id
                                      FROM file
                                      WHERE component_id = ?))
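
The component-level query can be exercised in the same way. The sketch below is only an illustration: the file names, the extra rows for files 2 to 4, and the helper names make_db and component_error_sum are made up for this example, and the tables are trimmed to the columns the query touches.

```python
import sqlite3

# The component-level query from the answer; both placeholders take the same component id.
COMPONENT_SUM_SQL = """
SELECT SUM(errors)
FROM data
WHERE file_id IN (SELECT id
                  FROM file
                  WHERE component_id = ?)
  AND version_id = (SELECT MAX(version_id)
                    FROM data
                    WHERE file_id IN (SELECT id
                                      FROM file
                                      WHERE component_id = ?))
"""

# Hypothetical files: (id, component_id, name)
FILES = [(1, 1, "a.c"), (2, 1, "b.c"), (3, 2, "c.c"), (4, 1, "d.c")]

# Hypothetical data rows: (file_id, version_id, function_id, errors)
ROWS = [
    (1, 3, 1, 40),
    (1, 3, 2, 231),
    (1, 2, 3, 19),
    (2, 3, 4, 5),
    (2, 2, 5, 7),
    (3, 2, 6, 100),
    (4, 2, 7, 1000),  # file 4 has no version-3 rows at all
]


def make_db():
    """Build an in-memory database with trimmed file and data tables (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE file
            (id INTEGER PRIMARY KEY AUTOINCREMENT,
            component_id INTEGER,
            name TEXT UNIQUE);
        CREATE TABLE data
            (id INTEGER PRIMARY KEY AUTOINCREMENT,
            file_id INTEGER,
            version_id INTEGER,
            function_id INTEGER,
            errors INTEGER,
            FOREIGN KEY(file_id) REFERENCES file(id),
            UNIQUE(file_id, version_id, function_id));
        """
    )
    conn.executemany("INSERT INTO file VALUES (?, ?, ?)", FILES)
    conn.executemany(
        "INSERT INTO data (file_id, version_id, function_id, errors) "
        "VALUES (?, ?, ?, ?)",
        ROWS,
    )
    return conn


def component_error_sum(conn, component_id):
    """Sum errors for the component's files at the component-wide latest version."""
    params = (component_id, component_id)
    return conn.execute(COMPONENT_SUM_SQL, params).fetchone()[0]


conn = make_db()
print(component_error_sum(conn, 1))  # 276, i.e. 40 + 231 + 5
print(component_error_sum(conn, 2))  # 100
```

Note that the inner MAX(version_id) is taken over all files of the component, so a file whose newest rows are at an older version (file 4 in this made-up data) contributes nothing to the total. That matches a reading of the question in which versions are global; if each file should instead be measured at its own latest version, the query would need a correlated subquery per file, which the answer above does not cover.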
