Select Earliest Date and Time from List of Distinct User Sessions

Problem description

I have a table of user access sessions which records website visitor activity:

accessid, userid, date, time, url

I'm trying to retrieve all distinct sessions for userid 1234, as well as the earliest date and time for each of those distinct sessions.

SELECT 
    DISTINCT accessid, 
    date, 
    time 
FROM 
    accesslog 
WHERE userid = '1234' 
GROUP BY accessid

This gives me the date and time of a random row within each distinct accessid. I've read a number of posts recommending the use of min() and max(), so I tried:

SELECT DISTINCT accessid, MIN(DATE) AS date, MIN(TIME) AS time FROM accesslog WHERE userid = '1234' GROUP BY accessid ORDER BY date DESC, time DESC

... or even ...

SELECT DISTINCT accessid, MIN(CONCAT(DATE, ' ', TIME)) AS datetime FROM accesslog WHERE userid = '1234' GROUP BY accessid ORDER BY date DESC, time DESC

... but I never get the correct result of the earliest date and time.

What is the trick to ordering this kind of query?
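One pitfall of the MIN(date), MIN(time) attempt can be shown in isolation: the two aggregates are evaluated independently, so the pair that comes back does not have to belong to any single row. The sketch below is only an illustration under assumptions. It uses Python's sqlite3 with an in-memory SQLite database standing in for MySQL, the rows are hypothetical (not taken from the table in the question), and the `run` helper is a made-up name.

```python
import sqlite3

# Hypothetical rows (not from the question): the earliest date and the
# earliest time of day deliberately sit on different rows.
ROWS = [
    ("s1", "1234", "2009-08-15", "23:00:00"),
    ("s1", "1234", "2009-09-01", "01:00:00"),
]

# Each MIN() is computed on its own column, independently of the other.
SEPARATE_MINS = """
SELECT accessid, MIN(date), MIN(time)
FROM accesslog
WHERE userid = '1234'
GROUP BY accessid
"""

# SQLite concatenates with ||; MySQL would use CONCAT(date, ' ', time).
COMBINED_MIN = """
SELECT accessid, MIN(date || ' ' || time)
FROM accesslog
WHERE userid = '1234'
GROUP BY accessid
"""


def run(sql, rows=ROWS):
    """Load rows into a throwaway in-memory table and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accesslog (accessid TEXT, userid TEXT, date TEXT, time TEXT)"
    )
    conn.executemany("INSERT INTO accesslog VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


print(run(SEPARATE_MINS))  # date from the first row, time from the second
print(run(COMBINED_MIN))   # one combined value taken from the earliest row
```

With these two rows the separate aggregates pair 2009-08-15 with 01:00:00, a combination that exists in no row, while the combined value is 2009-08-15 23:00:00. The comparison relies on dates and times being stored as zero-padded ISO text, so that string order matches chronological order.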

Edit -

Something strange is happening....

The code posted below by Bill Karwin correctly retrieves the earliest date and time for sessions that started in 2009-09. But for sessions that began on some day in 2009-08, what is returned is the date and time of the first hit in the current month. In other words, the query does not appear to span months!

Sample data set:

accessid    | userid    | date          | time
1           | 1234      | 2009-08-15    | 01:01:01
1           | 1234      | 2009-09-01    | 12:01:01
1           | 1234      | 2009-09-15    | 13:01:01
2           | 1234      | 2009-09-01    | 14:01:01
2           | 1234      | 2009-09-15    | 15:01:01

At least on my actual data table, the query posted below finds the following earliest date and time for each of the two accessids:

accessid    | userid    | date          | time
1           | 1234      | 2009-09-01    | 12:01:01
2           | 1234      | 2009-09-01    | 14:01:01

... and I would guess that the only reason the result for accessid 2 appears correct is that it has no hits in a previous month.

Am I going crazy?

Edit 2 -

The answer is yes, I am going crazy. The query works on the above sample data when it is placed in a table with the same structure.

Here is the (truncated) original data. I included the very first hit, another hit in the same month, the first hit of the next month, and then the last hit of that month. The original data set has many more hits in between these points, for a total of 462 rows.

accessid                            | date          | time
cbb82c08d3103e721a1cf0c3f765a842    | 2009-08-18    | 04:01:42
cbb82c08d3103e721a1cf0c3f765a842    | 2009-08-23    | 23:18:52
cbb82c08d3103e721a1cf0c3f765a842    | 2009-09-17    | 05:12:16
cbb82c08d3103e721a1cf0c3f765a842    | 2009-09-18    | 06:29:59

... the query returns the 2009-09-17 value as the earliest value when the original table is queried. But, when I copy the ........ oh, balls.

It is because the 2009-08% hits have an empty userid field.

Recommended answer

This is a variation of the "greatest-n-per-group" problem that comes up on StackOverflow several times per week.

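SELECT
    a1.accessid,
    a1.date,
    a1.time
FROM
    accesslog a1
-- a2 matches any strictly earlier hit in the same session for the same user
LEFT OUTER JOIN
    accesslog a2
  ON (a1.accessid = a2.accessid AND a1.userid = a2.userid
    AND (a1.date > a2.date OR (a1.date = a2.date AND a1.time > a2.time)))
WHERE a1.userid = '1234'
  -- keep only rows for which no earlier hit was found
  AND a2.accessid IS NULL;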

The way this works is that we try to find a row (a2) that has the same accessid and userid, and an earlier date or time than the row a1. When we can't find an earlier row, then a1 must be the earliest row.
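To see the join in action, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the same query from Python. Several things here are assumptions rather than part of the original setup: SQLite stands in for MySQL, `earliest_hits` is a made-up helper name, and dates and times are stored as ISO-formatted text so that string comparison matches chronological order.

```python
import sqlite3

# Sample rows from the question: (accessid, userid, date, time).
ROWS = [
    (1, "1234", "2009-08-15", "01:01:01"),
    (1, "1234", "2009-09-01", "12:01:01"),
    (1, "1234", "2009-09-15", "13:01:01"),
    (2, "1234", "2009-09-01", "14:01:01"),
    (2, "1234", "2009-09-15", "15:01:01"),
]

# Same anti-join as above, with the userid passed as a parameter and an
# ORDER BY added only to make the output order predictable.
EARLIEST_PER_SESSION = """
SELECT a1.accessid, a1.date, a1.time
FROM accesslog a1
LEFT OUTER JOIN accesslog a2
  ON (a1.accessid = a2.accessid AND a1.userid = a2.userid
    AND (a1.date > a2.date OR (a1.date = a2.date AND a1.time > a2.time)))
WHERE a1.userid = ?
  AND a2.accessid IS NULL
ORDER BY a1.accessid
"""


def earliest_hits(rows, userid):
    """Return the earliest (accessid, date, time) row per session for one user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accesslog "
        "(accessid INTEGER, userid TEXT, date TEXT, time TEXT, url TEXT)"
    )
    conn.executemany(
        "INSERT INTO accesslog (accessid, userid, date, time) VALUES (?, ?, ?, ?)",
        rows,
    )
    result = conn.execute(EARLIEST_PER_SESSION, (userid,)).fetchall()
    conn.close()
    return result


for row in earliest_hits(ROWS, "1234"):
    print(row)
```

With the five sample rows this returns the 2009-08-15 01:01:01 hit for accessid 1 and the 2009-09-01 14:01:01 hit for accessid 2. Every row that has an earlier sibling finds a match in a2 and is dropped by the IS NULL filter.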

Re your comment, I just tried it with the sample data you provided. Here's what I get:

+----------+------------+----------+
| accessid | date       | time     |
+----------+------------+----------+
|        1 | 2009-08-15 | 01:01:01 | 
|        2 | 2009-09-01 | 14:01:01 | 
+----------+------------+----------+

I'm using MySQL 5.0.75 on Mac OS X.
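The explanation the asker arrived at in the final edit (the 2009-08 hits have an empty userid) can be reproduced with a rough sketch like the one below. It is an illustration under assumptions, not the original environment: SQLite via Python's sqlite3 stands in for MySQL, the empty userid is modelled as an empty string (it could just as well be NULL in the real table), '1234' is assumed as the userid on the 2009-09 hits, and the helper names `build_db`, `earliest_for_user`, and `blank_userid_counts` are made up.

```python
import sqlite3

ACCESSID = "cbb82c08d3103e721a1cf0c3f765a842"

# Truncated rows from the question. The empty userid on the 2009-08 hits
# follows the asker's final edit; the exact stored value is an assumption.
ROWS = [
    (ACCESSID, "", "2009-08-18", "04:01:42"),
    (ACCESSID, "", "2009-08-23", "23:18:52"),
    (ACCESSID, "1234", "2009-09-17", "05:12:16"),
    (ACCESSID, "1234", "2009-09-18", "06:29:59"),
]


def build_db(rows):
    """Create an in-memory accesslog table holding the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accesslog (accessid TEXT, userid TEXT, date TEXT, time TEXT)"
    )
    conn.executemany("INSERT INTO accesslog VALUES (?, ?, ?, ?)", rows)
    return conn


def earliest_for_user(conn, userid):
    """Run the anti-join from the answer for one userid."""
    sql = """
    SELECT a1.accessid, a1.date, a1.time
    FROM accesslog a1
    LEFT OUTER JOIN accesslog a2
      ON (a1.accessid = a2.accessid AND a1.userid = a2.userid
        AND (a1.date > a2.date OR (a1.date = a2.date AND a1.time > a2.time)))
    WHERE a1.userid = ?
      AND a2.accessid IS NULL
    """
    return conn.execute(sql, (userid,)).fetchall()


def blank_userid_counts(conn):
    """Count hits per session whose userid is NULL or an empty string."""
    sql = """
    SELECT accessid, COUNT(*)
    FROM accesslog
    WHERE userid IS NULL OR userid = ''
    GROUP BY accessid
    """
    return conn.execute(sql).fetchall()


conn = build_db(ROWS)
print(earliest_for_user(conn, "1234"))  # the 2009-08 hits are invisible here
print(blank_userid_counts(conn))        # sessions affected by blank userids
conn.close()
```

Because both the WHERE clause and the join condition filter on userid, the two 2009-08 rows never take part in the comparison, so the query reports 2009-09-17 05:12:16 as the earliest hit, which matches what the asker saw. The second query is one way to check a table for the same problem before trusting the result.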
