Select data within one month prior to each user's last record

Question

Assume I have a table called "Diary" like this:

| id | user_id |        recorded_at       | record |
|----|---------|--------------------------|--------|
| 20 |  50245  |2017-10-01 23:00:14.765366|   89   |
| 21 |  50245  |2017-12-05 10:00:33.135331|   97   |
| 22 |  50245  |2017-12-31 11:50:23.965134|   80   |
| 23 |  76766  |2015-10-06 11:00:14.902452|   70   |
| 24 |  76766  |2015-10-07 22:40:59.124553|   81   |

For each user, I want to retrieve the latest row and all rows within one month prior to it.

In other words, for user_id 50245, I want his/her data from "2017-12-01 11:50:23.965134" to "2017-12-31 11:50:23.965134"; for user_id 76766, I want his/her data from "2015-09-07 22:40:59.124553" to "2015-10-07 22:40:59.124553".

Hence the desired result looks like this:

| id | user_id |        recorded_at       | record |
|----|---------|--------------------------|--------|
| 21 |  50245  |2017-12-05 10:00:33.135331|   97   |
| 22 |  50245  |2017-12-31 11:50:23.965134|   80   |
| 23 |  76766  |2015-10-06 11:00:14.902452|   70   |
| 24 |  76766  |2015-10-07 22:40:59.124553|   81   |

Please note that the record of id 20 is not included because it is more than one month prior to user_id 50245's last record.

Is there any way I can write an SQL query to achieve this?

Recommended answer

For small tables, any (valid) query technique is good.

For big tables, details matter. Assuming:

  • There is also a users table with user_id as PK containing all relevant users (or possibly a few more). This is the typical setup.

  • You have (or can create) an index on diary (user_id, recorded_at DESC NULLS LAST). NULLS LAST is optional if recorded_at is defined NOT NULL. But make sure the query matches the index.

  • There are more than a few rows per user, which is the typical use case.
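The index from the assumptions above could be created roughly like this. It is a minimal sketch: the index name diary_user_id_recorded_at_idx is made up for illustration, and the table and column names are taken from the question.

```sql
-- Hypothetical index name; the column list and sort order follow the assumption above.
-- NULLS LAST can be dropped if recorded_at is defined NOT NULL,
-- as long as the query's ORDER BY uses the same sort order as the index.
CREATE INDEX diary_user_id_recorded_at_idx
    ON diary (user_id, recorded_at DESC NULLS LAST);
```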

This should be among the fastest options:

SELECT d.*
FROM   users u
CROSS  JOIN LATERAL (
   -- latest recorded_at for this user; same sort order as the index
   SELECT recorded_at
   FROM   diary
   WHERE  user_id = u.user_id
   ORDER  BY recorded_at DESC NULLS LAST
   LIMIT  1
   ) d1
-- all rows of the same user from one month before that timestamp onward
JOIN   diary d ON d.user_id = u.user_id
              AND d.recorded_at >= d1.recorded_at - interval '1 month'
ORDER  BY d.user_id, d.recorded_at;

Produces your desired result exactly.

For only a few rows per user, max() or DISTINCT ON () in a subquery is typically faster.
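As a rough sketch of those two alternatives (not taken from the original answer): the aliases l and last_recorded_at are made up for illustration, and recorded_at is assumed to be a timestamp column so that subtracting an interval works.

```sql
-- Variant 1: max() in a subquery.
-- The subquery computes each user's latest timestamp in one pass over diary.
SELECT d.*
FROM  (
   SELECT user_id, max(recorded_at) AS last_recorded_at
   FROM   diary
   GROUP  BY user_id
   ) l
JOIN   diary d ON d.user_id = l.user_id
              AND d.recorded_at >= l.last_recorded_at - interval '1 month'
ORDER  BY d.user_id, d.recorded_at;

-- Variant 2: DISTINCT ON () in a subquery (PostgreSQL-specific).
-- DISTINCT ON (user_id) keeps the first row per user in ORDER BY order,
-- which is the latest one here.
SELECT d.*
FROM  (
   SELECT DISTINCT ON (user_id)
          user_id, recorded_at AS last_recorded_at
   FROM   diary
   ORDER  BY user_id, recorded_at DESC NULLS LAST
   ) l
JOIN   diary d ON d.user_id = l.user_id
              AND d.recorded_at >= l.last_recorded_at - interval '1 month'
ORDER  BY d.user_id, d.recorded_at;
```

Neither variant needs the users table, since the set of users is derived from diary itself.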

Related (with detailed explanation):

About the FROM clause:
