如何计算不同的多个字段而不重复查询? [英] How can I count distinct multiple fields without repeating the query?

查看:181
本文介绍了如何计算不同的多个字段而不重复查询?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个查询,其中包含几个返回每月计数的分组。像这样:

I have a query with several groupings that returns a count per month. Something like this:

SELECT field1, field2, year(someDate), month(someDate), count(*) as myCount
FROM myTable
WHERE field5 = 'test'
GROUP BY field1, field2, year(someDate), month(someDate)

问题是,我想根据id字段+日期字段(没有时间),每天不同的计数。因为,我想获得每天,每月的ids的独特计数。所以我想要这样的东西:

The problem is that I want the count to be distinct per day, based on an id field + the date field (without the time). As in, I want to get the distinct count of ids each day, per month. So I want something like this:

SELECT field1, field2, year(someDate), month(someDate), 
       count(distinct someID, someDate) as myCount
FROM myTable
WHERE field5 = 'test'
GROUP BY field1, field2, year(someDate), month(someDate)

这有两个问题:


  1. 您不能为计数聚合列出2个不同的字段

  2. 这将包括日期的时间,因此它不会过滤任何内容,因为它几乎总是有不同的时间

我可以轻松地通过转换为varchar的日期,但我不知道如何处理多个不同字段的问题。我无法使用此解决方案,因为我不想重复整个where子句和group by子句。这是我想出来的:

I can take care of 2. easily by converting to a varchar of just the date, but I'm not sure how to deal with the multiple distinct fields issue. I can't use this solution, as I don't want to repeat the entire where clause and group by clause. This is what I've come up with:

SELECT field1, field2, year(someDate), month(someDate), 
       count(distinct someID + CONVERT(VARCHAR, someDate, 112)) as myCount
FROM myTable
WHERE field5 = 'test'
GROUP BY field1, field2, year(someDate), month(someDate)

我不是在逗号分隔的列表中列出不同的字段, 。这个方法有什么缺点,我应该注意吗?我可以指望它是准确的吗?

Instead of listing the distinct fields in a comma-delimited list, I simply concatenated them. Is there any downside to this method that I should look out for? Can I count on it to be accurate? And - is there any better way to accomplish this?

基本上,我每个月进行分组,但不同的计数应该基于天。如果,如果我在1月3日和1月5日的ID为31,我希望它计数为1月1日,但如果我有31两次1月3日,我只想要一次计数。

Basically, I'm grouping per month, but the "distinct" count should be based on day. As in, if I have id 31 on Jan 3 and Jan 5, I want it to count as 2 for January, but if I have id 31 twice on Jan 3, I only want it to count once.

一些基本的示例数据&预期输出(为此跳过field1和field2):

Some basic sample data & expected output (skipping field1 and field2 for this):

*Date*              *ID*
1/3/12 00:00:09     22
1/3/12 00:13:00     22
1/4/12 12:00:00     22
1/7/12 15:00:45     27
1/15/12 15:00:00    22
2/6/12 00:00:09     50
2/8/12 00:13:00     44
2/8/12 12:00:00     45
2/22/12 15:00:45    33
2/22/12 15:00:00    33
2/22/12 15:00:00    44

*Year*  *Month* *Count*
2012    Jan     4
2012    Feb     5


推荐答案

UPDATED

对于您的示例数据,这将提供所需的结果:

Based on your sample data, this gives the required result:

Declare @Tab table ([Date] datetime,ID int)
insert into @Tab([Date],ID) values
('2012-01-03T00:00:09.000', 22),
('2012-01-03T00:13:00.000', 22),
('2012-01-04T12:00:00.000', 22),
('2012-01-07T15:00:45.000', 27),
('2012-01-15T15:00:00.000', 22),
('2012-02-06T00:00:09.000', 50),
('2012-02-08T00:13:00.000', 44),
('2012-02-08T12:00:00.000', 45),
('2012-02-22T15:00:45.000', 33),
('2012-02-22T15:00:00.000', 33),
('2012-02-22T15:00:00.000', 44)

select DATEADD(month,DATEDIFF(month,0,[Date]),0) as MonthStart,SUM(distinctDayIDs)
from
(
    SELECT DATEADD(day,DATEDIFF(day,0,[Date]),0) as [Date], 
           count(distinct ID) as distinctDayIDs
    FROM @Tab
    --WHERE field5 = 'test'
    GROUP BY DATEADD(day,DATEDIFF(day,0,[Date]),0)
) t
group by DATEADD(month,DATEDIFF(month,0,[Date]),0)

我认为,因为我们必须对每一天进行计数,所以我们必须做两个独立的分组操作。

I think, because we have to do a count for each day, we have to do it as two separate grouping operations.

较旧的答案

听起来像所需的输出会是 field1 field2 ,日期和该日期不同ID的计数?

It sounds like the desired output would be field1, field2, the date, and the count of distinct IDs on that date?

如果是,我认为你过于复杂的事情:

If so, I think you're overcomplicating things:

SELECT field1, field2, DATEADD(day,DATEDIFF(day,0,someDate),0) as Date, 
       count(distinct someID) as myCount
FROM myTable
WHERE field5 = 'test'
GROUP BY ffield1, field2, DATEADD(day,DATEDIFF(day,0,someDate),0)

code> DATEADD / DATEDIFF 去掉时间部分,而不是转换为 varchar

(I'm using DATEADD/DATEDIFF to strip the time portion, rather than converting to varchar)

这篇关于如何计算不同的多个字段而不重复查询?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆