在数据库中分离新用户和返回用户 [英] Separate new and returned users in database
问题描述
我有一个很大的数据库,但是出于这个问题的目的我已经简化了它:
基本上,每次用户在我的网站上单击某些内容时,它都会在数据库中记录为一行: UserID
字段是用于标识用户的Cookie,而 Stuff
和 MoreStuff
是有关点击的数据。注意,显然,每次 Foo和 Foo都可能不同。和 Bar
我想做的是:使用SQL查询,过滤掉所有的第一次访问或所有重复的访问(我假设是否可以做一个,我可以为另一个求反。因此,如果我要过滤掉样本中所有的重复访问,我将得到:
,绿色表示所选行,红色表示拒绝的行。
操作方法
您可以使用聚合函数 COUNT()
,然后 HAVING
语句,例如:
SELECT userID,来自tbl
的COUNT(用户ID)
来自用户ID的组
有COUNT(用户ID)> = 2
如果您想过滤谁是重复访问者,则可以重复使用以上查询:
SELECT *从tbl
存在
(
选择userID,COUNT(userID)
从tbl
GROUP BY用户ID
有COUNT个(userID)> == 2
)
GROUP BY用户ID
希望看到您可以更改的首次访问者 HAVING COUNT(userID)> = 2
到 HAVING COUNT(userID)= 1
或当然代替 EXISTS
使用不存在
。
请参见 演示
I have quite a large DB, but I've simplified it for the purpose of this question:
Basically, every time a user clicks something on my site, it gets logged as a row in my DB: the UserID
field is a cookie that is used to identify the user, and Stuff
and MoreStuff
are data about the click. Note that, obviously, these are likely to be different every time, "Foo" and "Bar" is just a representation.
What I want to do is this: with an SQL query, filter out either all of the first visits, or all of the repeated visits (I assume if I can do one, I can invert my filter for the other). So, if I were to filter out all of the repeat visits on my sample, I'd get this:
with green representing the selected rows and red representing the rejected ones.
How can I do this with just SQL?
You could use aggregate function COUNT()
and then HAVING
statement like:
SELECT userID, COUNT(userID)
FROM tbl
GROUP BY userID
HAVING COUNT(userID) >= 2
You can then re-use the above query if you want to filter out who are the repeat visitors like:
SELECT * FROM tbl
WHERE EXISTS
(
SELECT userID, COUNT(userID)
FROM tbl
GROUP BY userID
HAVING COUNT(userID) >= 2
)
GROUP BY userID
If you want to see first-time visitors you could change HAVING COUNT(userID) >= 2
to HAVING COUNT(userID) = 1
or of course instead of EXISTS
use NOT EXISTS
.
See Demo
这篇关于在数据库中分离新用户和返回用户的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!