Assigning a visit number to rows in a database?

Views: 66 Published: 2020/5/30 1:52:40 Tags: sql postgresql

Problem description

I tried searching around with no luck; maybe I'm just not using the right terminology. Anyway, I'm working with a PostgreSQL 8.1 database (Redshift) where a sample of the data looks like this:

User ID   Timestamp    Visit ID
1         05:20:20am   10
2         05:21:00am   11
1         05:22:14am   12
1         05:24:09am   13

Let's say this is a very basic sample of the data. I want to analyze how people do on other metrics when it's their 1st visit, 2nd visit, 3rd visit, etc. The easiest way I can think of to do this would be to create either a separate table that looks like:

Visit ID    Visit Number
10          1
11          1
12          2 
13          3

Or to somehow append a visit number to my original table:

User ID   Timestamp    Visit ID    Visit Number
1         05:20:20am   10          1
2         05:21:00am   11          1 
1         05:22:14am   12          2
1         05:24:09am   13          3

I know I can use GROUP BY and COUNT to analyze performance for visitors who came at least x times versus those who came fewer times. However, I don't know how to analyze performance on the first, second, third visit, especially when users could have visited anywhere from 1 to 50,000 times.

So an example question I'd like to answer: for users who visit at least 10 times, do they perform better on their first five visits or their second five?

Thanks very much.

Recommended answer

One (potentially slow) option is to use a correlated subquery:

SELECT  t.UserID, 
        t.Timestamp, 
        t.VisitID,
        (   SELECT  COUNT(*) + 1
            FROM    T T2
            WHERE   T2.UserID = T.UserID
            AND     T2.TimeStamp < T.TimeStamp
        ) VisitNumber
FROM    T;

Example on SQL Fiddle
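
To check the correlated subquery against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database and stores the result as the separate Visit ID to Visit Number table the question describes. SQLite is only a stand-in for the real database, the table name `visit_numbers` is hypothetical, and the timestamps are stored as plain 24-hour text so that they compare correctly.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (UserID INTEGER, Timestamp TEXT, VisitID INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        (1, "05:20:20", 10),
        (2, "05:21:00", 11),
        (1, "05:22:14", 12),
        (1, "05:24:09", 13),
    ],
)

# Materialise the mapping as its own table.
# visit_numbers is a hypothetical table name, not part of the original schema.
conn.execute(
    """
    CREATE TABLE visit_numbers AS
    SELECT  t.VisitID,
            (   SELECT  COUNT(*) + 1
                FROM    T t2
                WHERE   t2.UserID = t.UserID
                AND     t2.Timestamp < t.Timestamp
            ) AS VisitNumber
    FROM    T t
    """
)

visit_numbers = conn.execute(
    "SELECT VisitID, VisitNumber FROM visit_numbers ORDER BY VisitID"
).fetchall()
print(visit_numbers)  # [(10, 1), (11, 1), (12, 2), (13, 3)]
```

Note that two visits by the same user with an identical timestamp would receive the same visit number with this approach, because the subquery only counts strictly earlier rows.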

The easiest (although possibly not viable) solution would be to upgrade to PostgreSQL 8.4 or later and take advantage of ROW_NUMBER:

SELECT  t.UserID, 
        t.Timestamp, 
        t.VisitID,
        ROW_NUMBER() OVER(PARTITION BY t.UserID ORDER BY t.Timestamp) AS VisitNumber
FROM    T;

Example on SQL Fiddle

Finally, if the first query is too slow and upgrading is not an option, the following article goes into some detail about a further alternative. I have no idea how it will perform, though.

http://www.depesz.com/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/
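
Once each row has a visit number, the example question (first five visits versus second five, for users with at least 10 visits) comes down to bucketing on that number. The sketch below is one possible way to do it, not a tuned solution: it uses an in-memory SQLite database as a stand-in, made-up data, and a hypothetical `Score` column for the per-visit metric, since the question does not name one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Score is a hypothetical per-visit metric; the original table does not define one.
conn.execute(
    "CREATE TABLE T (UserID INTEGER, Timestamp TEXT, VisitID INTEGER, Score REAL)"
)

rows = []
# Made-up data: user 1 visits 10 times with scores 1..10; user 2 visits only 3 times.
for i in range(1, 11):
    rows.append((1, f"00:00:{i:02d}", i, float(i)))
for i in range(1, 4):
    rows.append((2, f"00:01:{i:02d}", 100 + i, 100.0))
conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?)", rows)

query = """
WITH numbered AS (
    -- Number each user's visits with a correlated subquery.
    SELECT  t.UserID,
            t.Score,
            (   SELECT  COUNT(*) + 1
                FROM    T t2
                WHERE   t2.UserID = t.UserID
                AND     t2.Timestamp < t.Timestamp
            ) AS VisitNumber
    FROM    T t
)
SELECT  CASE WHEN VisitNumber <= 5 THEN 'visits 1-5' ELSE 'visits 6-10' END AS Bucket,
        AVG(Score) AS AvgScore
FROM    numbered
WHERE   VisitNumber <= 10
-- Keep only users with at least 10 visits in total.
AND     UserID IN (SELECT UserID FROM T GROUP BY UserID HAVING COUNT(*) >= 10)
GROUP BY Bucket
ORDER BY Bucket
"""

results = conn.execute(query).fetchall()
print(results)  # [('visits 1-5', 3.0), ('visits 6-10', 8.0)]
```

User 2 is excluded because they have fewer than 10 visits, so only user 1's scores contribute to the two averages.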
