仅选择具有最后修改的时间戳记的行,而不选择具有相同ID和较早时间戳记的重复行 [英] Select only rows with last modified timestamp, without duplicate rows with same ID and older timestamp

查看:62
本文介绍了仅选择具有最后修改的时间戳记的行,而不选择具有相同ID和较早时间戳记的重复行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我找不到解决我问题的方法.我有两个表Order和OrderDetail.

订单表 (简单版本)

| ID | modified  |
| 1  | 7.1.2018. |
| 2  | 10.1.2018.|
| 3  | 15.1.2018.|
| 4  | 20.1.2018.|
| 5  | 25.1.2018.|

OrderDetails (简单版本)

| order_id | detail_id | base_price | buy_price | sell_price|
| 1        | 1         | 99.00      | 111.00    | 122.00    |
| 1        | 2         | 82.00      | 95.00     | 117.00    | 
| 1        | 3         | 82.00      | 95.00     | 117.00    |
| 2        | 4         | 95.00      | 108.00    | 119.00    | 
| 2        | 5         | 86.00      | 94.00     | 115.00    | 
| 2        | 1         | 82.00      | 95.00     | 117.00    |
| 3        | 1         | 92.00      | 106.00    | 116.00    | 
| 3        | 4         | 90.00      | 100.00    | 120.00    | 
| 3        | 5         | 82.00      | 95.00     | 117.00    |
| 4        | 2         | 92.00      | 106.00    | 116.00    | 
| 4        | 3         | 90.00      | 100.00    | 120.00    | 
| 4        | 1         | 82.00      | 95.00     | 117.00    |
| 5        | 1         | 92.00      | 106.00    | 116.00    | 
| 5        | 5         | 90.00      | 100.00    | 120.00    | 
| 5        | 3         | 82.00      | 95.00     | 117.00    |

如何从OrderDetails表中获取与Order表中最后修改的时间戳相关联的行?

结果应为:

| order_id | detail_id | base_price | buy_price | sell_price | modified  |
| 5        | 1         | 92.00      | 106.00    | 116.00     | 25.1.2018.|
| 4        | 2         | 92.00      | 106.00    | 116.00     | 20.1.2018.|
| 5        | 3         | 82.00      | 95.00     | 117.00     | 25.1.2018.|
| 3        | 4         | 90.00      | 100.00    | 120.00     | 15.1.2018.|
| 5        | 5         | 90.00      | 100.00    | 120.00     | 25.1.2018.|

我知道要联接表,并从具有期望列的联接表中获取所有行,但是我不知道如何从每个 order_id,detail_id 对中仅过滤具有最新时间戳的行.拜托,我们将不胜感激.

编辑

Firebird 数据库需要查询.

编辑2.

第一个样本数据在某种程度上具有误导性.请再次查看扩展表和理想的结果. 我需要所有不同的行(基于"details_id")及其上次修改的数据.如何排除具有旧时间戳的每个"detail_id"的重复"行,并仅保留具有最新时间戳的"detail_id"行?

解决方案

with x as (select o.modified, od.* 
           from orderDetails od, orders o
           where o.id=od.order_id)
 , mx as (select max(modified) as modified, detail_id
          from x group by detail_id)
Select x.* from x, mx
Where x.detail_id = mx.detail_id and x.modified=mx.modified

这里我们使用通用表表达式,因此我们仅将两个表连接一次. 至少在编写查询时,我们只做过一次-因此,我们出现错别字或复制粘贴错误的机会会更少. 我们还暗示SQL Server仅执行一次连接,然后重用它,但是它是否遵循此提示-取决于其内部实现.

关于CTE的另一件好事:它可以帮助您逐步地从简单到复杂地构建查询.在 https://en.wikipedia.org/wiki/REPL上了解有关阅读评估打印循环的信息. a>
我稍后会再添加一些.

您可以在Google中找到许多有关CTE的文章.此处记录了Firebird实施: https://www.firebirdsql.org/file/documentation/reference_manuals/fblangref25-zh-CN/html/fblangref25-dml-select.html#fblangref25-dml-select-cte

由于我只使用了非常基本的SQL,所以我相信它可以在几乎所有实用的SQL服务器(包括Firebird)中使用.

这是查询和输出数据的结果: SQL提琴

PostgreSQL 9.6模式设置:

create table orders
 (id integer primary key,
  modified timestamp);
create index o_m on orders(modified);  

create table OrderDetails(
  order_id integer references orders(id),
  detail_id integer not null,
  base_price float,
  buy_price float,
  sell_price float );
create index od_do on OrderDetails(detail_id, order_id);

Insert into orders values
( 1, '2018-1-07'),
( 2, '2018-1-10'),
( 3, '2018-1-15'),
( 4, '2018-1-20'),
( 5, '2018-1-25');

Insert into OrderDetails values
(   1   ,   1   ,   99.00   ,   111.00  ,   122.00  ),
(   1   ,   2   ,   82.00   ,   95.00   ,   117.00  ),
(   1   ,   3   ,   82.00   ,   95.00   ,   117.00  ),
(   2   ,   4   ,   95.00   ,   108.00  ,   119.00  ),
(   2   ,   5   ,   86.00   ,   94.00   ,   115.00  ),
(   2   ,   1   ,   82.00   ,   95.00   ,   117.00  ),
(   3   ,   1   ,   92.00   ,   106.00  ,   116.00  ),
(   3   ,   4   ,   90.00   ,   100.00  ,   120.00  ),
(   3   ,   5   ,   82.00   ,   95.00   ,   117.00  ),
(   4   ,   2   ,   92.00   ,   106.00  ,   116.00  ),
(   4   ,   3   ,   90.00   ,   100.00  ,   120.00  ),
(   4   ,   1   ,   82.00   ,   95.00   ,   117.00  ),
(   5   ,   1   ,   92.00   ,   106.00  ,   116.00  ),
(   5   ,   5   ,   90.00   ,   100.00  ,   120.00  ),
(   5   ,   3   ,   82.00   ,   95.00   ,   117.00  );

查询1 :

with x as (select o.modified, od.* 
           from orderDetails od, orders o
           where o.id=od.order_id)
 , mx as (select max(modified) as modified, detail_id
          from x group by detail_id)
Select x.* from x, mx
Where x.detail_id = mx.detail_id and x.modified=mx.modified
Order by detail_id

结果 :

|             modified | order_id | detail_id | base_price | buy_price | sell_price |
|----------------------|----------|-----------|------------|-----------|------------|
| 2018-01-25T00:00:00Z |        5 |         1 |         92 |       106 |        116 |
| 2018-01-20T00:00:00Z |        4 |         2 |         92 |       106 |        116 |
| 2018-01-25T00:00:00Z |        5 |         3 |         82 |        95 |        117 |
| 2018-01-15T00:00:00Z |        3 |         4 |         90 |       100 |        120 |
| 2018-01-25T00:00:00Z |        5 |         5 |         90 |       100 |        120 |

请注意,如果您有两个或多个带有相同时间戳的订单,它将具有不同的输出!看来您甚至都没有考虑过这种可能性-但是既然有可能,它将最终发生.

现在,返回CTE和REPL .

在逐步构建查询时,从第一个模糊的主意到特定的行,最好检查输出数据是否恰好符合您的期望. 大大象最好被小块吃掉."

在这里,我将向您展示逐步构建查询的过程. 如果您在上面链接的SQL Fiddle中重复这些步骤,将很有用.

首先,我创建并填充了表格.

然后,我发出第一个查询只是为了检查是否正确填充了它们.

1:select * from orders-在SQL小提琴(或IBExpert,FlameRobin等)中尝试此操作以及进一步的查询

2:select * from orderDetails

3:然后,我发出了联接查询,以检查我的交叉表查询是否确实给出了有意义的输出.是的.

select o.modified, od.* 
from orderDetails od, orders o
where o.id=od.order_id

4:然后我想知道,是否可以从该查询中获取最后一个时间戳以获取详细信息?要检查它的执行情况,请执行以下操作:1)保存我之前做的并经过测试的上述查询,以及2)在其上面编写一个辅助查询.它确实提取了最后的更改日期.编写并经过测试.

with x as (select o.modified, od.* 
           from orderDetails od, orders o
           where o.id=od.order_id)
Select max(modified) as modified, detail_id
  from x group by detail_id

5:最后一步是也保存了测试二级查询,并在两个查询之上都编写了最终的第三级查询,从而给出了最终的过滤数据


更有效的解决方案可以是使用一次性运行联接查询(我在上面的第3步中引入的查询.,另存为x),先添加order by detail_id, modified desc,然后使用 Window Functions .

以下是使用该方法的类似问题的答案- Firebird从表的唯一字段中进行选择

窗口功能在Firebird 2.x中不可用.

I could not find solution for my problem. I have two tables Order and OrderDetail.

Order table (simple version)

| ID | modified  |
| 1  | 7.1.2018. |
| 2  | 10.1.2018.|
| 3  | 15.1.2018.|
| 4  | 20.1.2018.|
| 5  | 25.1.2018.|

OrderDetails (simple version)

| order_id | detail_id | base_price | buy_price | sell_price|
| 1        | 1         | 99.00      | 111.00    | 122.00    |
| 1        | 2         | 82.00      | 95.00     | 117.00    | 
| 1        | 3         | 82.00      | 95.00     | 117.00    |
| 2        | 4         | 95.00      | 108.00    | 119.00    | 
| 2        | 5         | 86.00      | 94.00     | 115.00    | 
| 2        | 1         | 82.00      | 95.00     | 117.00    |
| 3        | 1         | 92.00      | 106.00    | 116.00    | 
| 3        | 4         | 90.00      | 100.00    | 120.00    | 
| 3        | 5         | 82.00      | 95.00     | 117.00    |
| 4        | 2         | 92.00      | 106.00    | 116.00    | 
| 4        | 3         | 90.00      | 100.00    | 120.00    | 
| 4        | 1         | 82.00      | 95.00     | 117.00    |
| 5        | 1         | 92.00      | 106.00    | 116.00    | 
| 5        | 5         | 90.00      | 100.00    | 120.00    | 
| 5        | 3         | 82.00      | 95.00     | 117.00    |

How I can get rows from OrderDetails table which are connected with last modified timestamp from Order table?

Result should be:

| order_id | detail_id | base_price | buy_price | sell_price | modified  |
| 5        | 1         | 92.00      | 106.00    | 116.00     | 25.1.2018.|
| 4        | 2         | 92.00      | 106.00    | 116.00     | 20.1.2018.|
| 5        | 3         | 82.00      | 95.00     | 117.00     | 25.1.2018.|
| 3        | 4         | 90.00      | 100.00    | 120.00     | 15.1.2018.|
| 5        | 5         | 90.00      | 100.00    | 120.00     | 25.1.2018.|

I know to join tables, and to get all rows from joint tables with desired columns, but I don't know how to filter only those rows with latest timestamp from every order_id, detail_id pair. Please, any help would be really appreciated.

Edit

Query is needed for Firebird database.

Edit 2.

First sample data was somehow misleading. Please take a look again on extended tables and desirable result. I need all distinct rows (based on "details_id") with it's last modified data. How to exclude "duplicate" rows for every "detail_id" with it's older timestamp and only keep "detail_id" rows with latest timestamp???

解决方案

with x as (select o.modified, od.* 
           from orderDetails od, orders o
           where o.id=od.order_id)
 , mx as (select max(modified) as modified, detail_id
          from x group by detail_id)
Select x.* from x, mx
Where x.detail_id = mx.detail_id and x.modified=mx.modified

Here we use Common Table Expressions, so we join the two tables only once. At least we did it only once when writing the query - thus we would have less chances to do typo or copy-paste errors. We also hinted the SQL server to do the join only once and then to reuse it, but would it follow this hint or not - depends on its internal implementation.

Another good thing about CTE: it helps you to build your query incrementally, step by step, from simple to complex. Read about Read–eval–print loop at https://en.wikipedia.org/wiki/REPL
I would add a bit more of it later.

You can find many articles about CTE in Google. The Firebird implementation is documented here: https://www.firebirdsql.org/file/documentation/reference_manuals/fblangref25-en/html/fblangref25-dml-select.html#fblangref25-dml-select-cte

Since I used only very basic SQL I believe it to work in almost any practical SQL server, Firebird including.

Here is the result of the query and the output data: SQL Fiddle

PostgreSQL 9.6 Schema Setup:

create table orders
 (id integer primary key,
  modified timestamp);
create index o_m on orders(modified);  

create table OrderDetails(
  order_id integer references orders(id),
  detail_id integer not null,
  base_price float,
  buy_price float,
  sell_price float );
create index od_do on OrderDetails(detail_id, order_id);

Insert into orders values
( 1, '2018-1-07'),
( 2, '2018-1-10'),
( 3, '2018-1-15'),
( 4, '2018-1-20'),
( 5, '2018-1-25');

Insert into OrderDetails values
(   1   ,   1   ,   99.00   ,   111.00  ,   122.00  ),
(   1   ,   2   ,   82.00   ,   95.00   ,   117.00  ),
(   1   ,   3   ,   82.00   ,   95.00   ,   117.00  ),
(   2   ,   4   ,   95.00   ,   108.00  ,   119.00  ),
(   2   ,   5   ,   86.00   ,   94.00   ,   115.00  ),
(   2   ,   1   ,   82.00   ,   95.00   ,   117.00  ),
(   3   ,   1   ,   92.00   ,   106.00  ,   116.00  ),
(   3   ,   4   ,   90.00   ,   100.00  ,   120.00  ),
(   3   ,   5   ,   82.00   ,   95.00   ,   117.00  ),
(   4   ,   2   ,   92.00   ,   106.00  ,   116.00  ),
(   4   ,   3   ,   90.00   ,   100.00  ,   120.00  ),
(   4   ,   1   ,   82.00   ,   95.00   ,   117.00  ),
(   5   ,   1   ,   92.00   ,   106.00  ,   116.00  ),
(   5   ,   5   ,   90.00   ,   100.00  ,   120.00  ),
(   5   ,   3   ,   82.00   ,   95.00   ,   117.00  );

Query 1:

with x as (select o.modified, od.* 
           from orderDetails od, orders o
           where o.id=od.order_id)
 , mx as (select max(modified) as modified, detail_id
          from x group by detail_id)
Select x.* from x, mx
Where x.detail_id = mx.detail_id and x.modified=mx.modified
Order by detail_id

Results:

|             modified | order_id | detail_id | base_price | buy_price | sell_price |
|----------------------|----------|-----------|------------|-----------|------------|
| 2018-01-25T00:00:00Z |        5 |         1 |         92 |       106 |        116 |
| 2018-01-20T00:00:00Z |        4 |         2 |         92 |       106 |        116 |
| 2018-01-25T00:00:00Z |        5 |         3 |         82 |        95 |        117 |
| 2018-01-15T00:00:00Z |        3 |         4 |         90 |       100 |        120 |
| 2018-01-25T00:00:00Z |        5 |         5 |         90 |       100 |        120 |

Note, it would have different output if you would have two or more orders with the same timestamp! It seems you did not even thought about this possibility - but since it is possible, it would eventually happen.

Now, Back to CTE and REPL.

As you incrementally build your query, from the first vague idea to specific rows, it would be nice to check that the output data is exactly what you expect. "Big elephant it better to be eaten by small pieces".

And here I will show you the step by step building of the query. It would be useful if you would repeat those steps in the SQL Fiddle linked above.

First of all I created and populated the tables.

Then I issued the first queries just to check I did populated them correctly.

1: select * from orders - try this and further queries in SQL fiddle (or in IBExpert, FlameRobin, etc

2: select * from orderDetails

3: Then I issued the joining query to check if my cross-tables query really gives meaningful output. It did.

select o.modified, od.* 
from orderDetails od, orders o
where o.id=od.order_id

4: Then I wondered, can I get the last timestamp for detail out of that query? To check it what I did was the following: 1) saved the aforementioned query I did and tested earlier, and 2) wrote a secondary query on top of it. It did extracting the last change date okay. Written and tested.

with x as (select o.modified, od.* 
           from orderDetails od, orders o
           where o.id=od.order_id)
Select max(modified) as modified, detail_id
  from x group by detail_id

5: And the last step was saving the test secondary query too, and writing the final, tertiary query on top of them both, giving the final filtered data


Yet more efficient solution can be using the one-run join query (the one I introduced above in Step 3. and saved as x) with adding order by detail_id, modified desc and then using Window Functions introduced in Firebird 3.

Here is the answer to a similar question using that approach - Firebird select from table distinct one field

Window Functions are not available in Firebird 2.x though.

这篇关于仅选择具有最后修改的时间戳记的行,而不选择具有相同ID和较早时间戳记的重复行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆