Calculating the mode with SQLite with grouping
Question
I have a table with an ID (IP address) and a factor variable (Web browser), and I need to make another table that has a single record for each ID, together with the mode of the factor variable. I was thinking of something like SELECT ip, MODE(browser) FROM log GROUP BY ip.
Unfortunately, SQLite doesn't implement a MODE function, so this doesn't work. I thought of building a temporary table with the counts of each browser and then using SELECT DISTINCT ON or RANK(), but SQLite doesn't support these either.
Additionally, it would be nice to do this in a single statement, because there are several other factors whose mode I also need (also grouped by the same ID).
Recommended answer
To compute the mode, group by the browser column, get the COUNT(*) for each group, sort by that count in descending order, and take the record with the largest value.
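As a minimal sketch of those steps, the script below runs the query through Python's standard sqlite3 module against an in-memory database. The table and column names follow the question; the sample rows and the second sort key (browser, used only to make ties deterministic) are assumptions added for illustration.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption, not from the question).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (ip TEXT, browser TEXT)")
conn.executemany(
    "INSERT INTO log (ip, browser) VALUES (?, ?)",
    [
        ("10.0.0.1", "Firefox"),
        ("10.0.0.1", "Firefox"),
        ("10.0.0.1", "Chrome"),
        ("10.0.0.2", "Safari"),
        ("10.0.0.2", "Firefox"),
    ],
)

# Group by browser, count each group, sort by the count, keep the top row.
# The extra "browser" sort key is an assumed tie-breaker.
overall_mode = conn.execute(
    """
    SELECT browser, COUNT(*) AS n
    FROM log
    GROUP BY browser
    ORDER BY n DESC, browser
    LIMIT 1
    """
).fetchone()

print(overall_mode)
```

Without a tie-breaker, SQLite may return any one of the equally frequent values, so which row wins a tie should not be relied on.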
If you already have another GROUP BY, use a correlated subquery:
SELECT ip,
       (SELECT browser
        FROM log AS log2
        WHERE log2.ip = ips.ip
        GROUP BY browser
        ORDER BY COUNT(*) DESC
        LIMIT 1)
FROM (SELECT DISTINCT ip
      FROM log) AS ips
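Since the question asks for the mode of several factors in a single statement, one possible extension is to add one correlated subquery per factor. The sketch below assumes a hypothetical second factor column named os, made-up sample rows, and column aliases (browser_mode, os_mode) that are not part of the original answer; the extra sort keys are again only assumed tie-breakers.

```python
import sqlite3

# In-memory database; the "os" column and all rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (ip TEXT, browser TEXT, os TEXT)")
conn.executemany(
    "INSERT INTO log (ip, browser, os) VALUES (?, ?, ?)",
    [
        ("10.0.0.1", "Firefox", "Linux"),
        ("10.0.0.1", "Firefox", "Windows"),
        ("10.0.0.1", "Chrome", "Windows"),
        ("10.0.0.2", "Safari", "macOS"),
        ("10.0.0.2", "Safari", "macOS"),
        ("10.0.0.2", "Chrome", "Linux"),
    ],
)

# One correlated subquery per factor, each restricted to the outer row's ip.
rows = conn.execute(
    """
    SELECT ip,
           (SELECT browser
            FROM log AS log2
            WHERE log2.ip = ips.ip
            GROUP BY browser
            ORDER BY COUNT(*) DESC, browser
            LIMIT 1) AS browser_mode,
           (SELECT os
            FROM log AS log3
            WHERE log3.ip = ips.ip
            GROUP BY os
            ORDER BY COUNT(*) DESC, os
            LIMIT 1) AS os_mode
    FROM (SELECT DISTINCT ip
          FROM log) AS ips
    ORDER BY ip
    """
).fetchall()

for row in rows:
    print(row)
```

To materialize the result as another table, as the question describes, the same SELECT could be placed after CREATE TABLE some_name AS, where some_name is whatever table name suits the schema.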