Count equal date parts of the timestamp column ACROSS the groups by the said column
Question
I would like to count duplicate date values. I already know that my field "WHEN" is a timestamp, so I have to cast it to the date type.
My actual query looks like this:
SELECT
u.USERNAME,
r."WHEN",
r.UPDATEINOUT,
case (r.UPDATEINOUT) when 0 then 0 when 1 then 1 else r.INOUT end INOUT
FROM
ATTENDANT r
LEFT JOIN
USERS u ON r.USERID = u.ID
where
u.USERNAME = 'rk' and (r.UPDATEINOUT = 1 or r.UPDATEINOUT = 0 or r.UPDATEINOUT is null)
group by
r."WHEN",
INOUT,
u.USERNAME,
r.UPDATEINOUT
order by
r."WHEN"
This is the result:
Username WHEN UPDATEINOUT INOUT
rk 09.04.2018, 14:59:45.000 [null] 0
rk 09.04.2018, 14:59:51.000 [null] 1
rk 11.04.2018, 08:31:02.000 [null] 0
rk 11.04.2018, 12:06:52.000 [null] 1
rk 11.04.2018, 12:10:29.000 [null] 0
rk 11.04.2018, 12:23:09.000 [null] 1
rk 11.04.2018, 12:43:47.000 [null] 0
rk 11.04.2018, 17:07:40.000 [null] 1
Now I would like to count the duplicate dates. The result should look like this:
Username WHEN UPDATEINOUT INOUT Count
rk 09.04.2018, 14:59:45.000 [null] 0 2
rk 09.04.2018, 14:59:51.000 [null] 1 2
rk 11.04.2018, 08:31:02.000 [null] 0 6
rk 11.04.2018, 12:06:52.000 [null] 1 6
rk 11.04.2018, 12:10:29.000 [null] 0 6
rk 11.04.2018, 12:23:09.000 [null] 1 6
rk 11.04.2018, 12:43:47.000 [null] 0 6
rk 11.04.2018, 17:07:40.000 [null] 1 6
When I add
count(cast(r."WHEN" as date))
then it shows me only one.
Recommended answer
It seems the other answers suggest "what to do" but do not try to explain why the results are what they are, or what is actually happening. It seems the topic starter is being given a fish, not a fishing rod.
When I add
count(cast(r."WHEN" as date))
then it shows me only one.
This is a bit funny. This line shows a novice's confusion between natural human languages like English and mathematical set languages like SQL.
R.Kut reads this added line as "I want to count how many [distinct] values of the given expression there are".
But that is not what this command actually means in SQL. In SQL it means "I want to count how many rows there are where the given expression is not null."
And so there is actually NO DIFFERENCE between count(cast(r."WHEN" as date)) and count(r."WHEN"): those two parameter expressions are either both NULL or both NOT NULL. Hence the count over those equally nullable parameters is equal too.
The very attempt to truncate the value of an aggregate function's parameter, as if that could alter its nullability, is a misconception. I was there too. It takes time to get used to what aggregates over mathematical sets really mean, and to the fact that you are not reading English when you are reading SQL.
Frankly, you could just do count(1) here, removing not only the typecast but the column itself. It would still be the same, because it is rows, not values, that are being counted. Unless there are rows where "WHEN" IS NULL: those would be accounted for by group by but not by count. After you read and think through the next section, come back and toy with http://sqlfiddle.com/#!9/ee09a/7
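The claim that count(expression) counts rows where the expression is not null can be checked with a small sketch. The assumptions here are mine, not the original poster's: an in-memory SQLite database stands in for the real server, the table is cut down to a single hypothetical column ts, and SQLite's date(ts) plays the role of cast(r."WHEN" as date).

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database server.
conn = sqlite3.connect(":memory:")
conn.execute("create table attendant (ts text)")  # hypothetical, trimmed-down table
conn.executemany(
    "insert into attendant values (?)",
    [
        ("2018-04-09 14:59:45",),
        ("2018-04-09 14:59:51",),
        ("2018-04-11 08:31:02",),
        (None,),  # one row where the timestamp is NULL
    ],
)

# date(ts) is NULL exactly when ts is NULL, so both counts see the same rows.
# count(1) counts every row, including the one with the NULL timestamp.
row_counts = conn.execute(
    "select count(ts), count(date(ts)), count(1) from attendant"
).fetchone()
print(row_counts)  # (3, 3, 4)
```

Truncating the timestamp changes nothing; only the NULL row makes count(1) differ from the other two.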
Now, there is another parameter to the count function that I kind of mentioned above. It is the "distinct" parameter.
Remark: one may say distinct is a keyword of the SQL language, not a parameter of a function. But de facto, if not de jure, it changes the way the function works, so in my perception it is a parameter, given in that unusual SQLish way parameters are often given to functions. Or, to reason about it another way, it may be seen as part of the function name, if we imagine having two functions to choose from: count and count-distinct.
So the topic starter could have added count(distinct cast(r."WHEN" as date)) instead, and...
...and seen that nothing changed. This time he really would be telling the server to count the not-null (always only not-null!) distinct values, but the counting still goes on within the group.
And what are those groups?
group by
r."WHEN",
INOUT,
u.USERNAME,
r.UPDATEINOUT
See, across the groups we have rows with distinct time-and-date values of "WHEN". The other columns differ too, but I am not focusing on them. The thing here is that within every group both the time and the date part of "WHEN" stay the same. And "being the same" means "there is one distinct value, repeated again and again". And if there is only one distinct time-and-date value, then the reduced time-only or date-only values have the same count of one (reducing a value can only make two previously different values equal; it cannot make previously equal values different).
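A minimal sketch of this effect, under the same assumptions as before (in-memory SQLite instead of the real server, a hypothetical one-column table, date(ts) in place of the cast): when the grouping key is the full timestamp, every group holds exactly one distinct date, while the ungrouped query sees two.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database server.
conn = sqlite3.connect(":memory:")
conn.execute("create table attendant (ts text)")  # hypothetical, trimmed-down table
conn.executemany(
    "insert into attendant values (?)",
    [
        ("2018-04-09 14:59:45",),
        ("2018-04-09 14:59:51",),
        ("2018-04-11 08:31:02",),
        ("2018-04-11 12:06:52",),
        ("2018-04-11 12:10:29",),
        ("2018-04-11 12:23:09",),
        ("2018-04-11 12:43:47",),
        ("2018-04-11 17:07:40",),
    ],
)

# Grouping by the full timestamp: each group has one timestamp, hence one date.
per_group = conn.execute(
    "select ts, count(distinct date(ts)) from attendant group by ts order by ts"
).fetchall()
per_group_counts = [n for _, n in per_group]
print(per_group_counts)  # [1, 1, 1, 1, 1, 1, 1, 1]

# For contrast: with no group by, the whole table is the single group.
whole_table_count = conn.execute(
    "select count(distinct date(ts)) from attendant"
).fetchone()[0]
print(whole_table_count)  # 2
```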
Granted, in other situations, when the counting goes over columns not included in the group by (or when there is no group by clause at all), the result may differ. There count(distinct ...) might do what the topic starter expects it to do.
http://sqlfiddle.com/#!9/0d65bf/7 - an example.
But one has to keep in mind:
- that comes at the expense of extra sorting and grouping work done by the server, potentially making the query slow or memory-hungry
- it would still work within the group (only the groups would then start to contain different values for the counted columns)
- ...it is just that sometimes the group is the whole query result set (in the simplest variant, the whole table), if the programmer did not set it to be different by adding a group by clause :-D
The code of the example linked above:
create table X(a integer, b integer);
insert into X values (1,1);
insert into X values (1,2);
insert into X values (1,2);
commit;
select count(distinct b) from x group by a
-- Result: 1 row: 2
-- or if the whole table is the group
select count(distinct b) from x
-- Result: 1 row: 2
-- but if the group includes the counted column
-- then every group would contain EXACTLY ONE
-- row with a not-null distinct value
select count(distinct b) from x group by b
-- Result: 2 rows: 1 and 1
Here we come to yet another modification to toy with.
group by
cast(r."WHEN" as date), -- <====
INOUT,
u.USERNAME,
r.UPDATEINOUT
Now, this time we are telling the server to assemble groups in which only the "date" part of "WHEN" is the same, while the "time" part might differ.
But...
- I think not every SQL server supports expressions in the group by clause.
- Unless there is an index built on exactly that expression, the server would have to do a lot of extra work, probably resorting to a natural scan and grouping in temporary files, which would make the query heavy and slow.
- And of course you would HAVE to modify the column list accordingly:
SELECT
u.USERNAME,
cast(r."WHEN" as date), -- <=== no more raw r."WHEN"
r.UPDATEINOUT,
Because, well, you just cannot have "One True Value" for WHEN in the row if you explicitly asked the server to group different values of it together.
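Grouping by the date expression can be sketched as follows, again assuming in-memory SQLite in place of the real server, a hypothetical one-column table, and date(ts) in place of cast(r."WHEN" as date). The per-day counts come out right, but the individual timestamps are gone from the result.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database server.
conn = sqlite3.connect(":memory:")
conn.execute("create table attendant (ts text)")  # hypothetical, trimmed-down table
conn.executemany(
    "insert into attendant values (?)",
    [
        ("2018-04-09 14:59:45",),
        ("2018-04-09 14:59:51",),
        ("2018-04-11 08:31:02",),
        ("2018-04-11 12:06:52",),
        ("2018-04-11 12:10:29",),
        ("2018-04-11 12:23:09",),
        ("2018-04-11 12:43:47",),
        ("2018-04-11 17:07:40",),
    ],
)

# The select list carries the same expression as the group by clause,
# not the raw timestamp column.
per_day = conn.execute(
    "select date(ts) as day, count(*) from attendant "
    "group by date(ts) order by day"
).fetchall()
print(per_day)  # [('2018-04-09', 2), ('2018-04-11', 6)]
```

Eight input rows collapse into two output rows, which is why this alone cannot produce the eight-row result the question asks for.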
Here you may come to the conclusion that there is no simple, straightforward way to have both groups and across-the-groups aggregates. Or, to put it differently, to have two different sets of group by criteria in one query (I do not want to mention UNION here, mkay?).
You would need ONE set of group by criteria for counting rows with the same date part (but maybe different time parts), and yet ANOTHER set of group by criteria for picking and rendering the groups that differ by both date and time.
In plain and simple SQL 1999 that means you would have to make TWO selects to do both groupings, one way or another, and this is where Yogesh's and Gordon's answers kick in.
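One way the "two selects" idea could look is sketched below: an inner select counts rows per day, and the outer select joins every detailed row back to its day. This is an illustration under my own assumptions (in-memory SQLite, a hypothetical one-column table, date(ts) in place of the cast), not necessarily the query from those answers.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database server.
conn = sqlite3.connect(":memory:")
conn.execute("create table attendant (ts text)")  # hypothetical, trimmed-down table
conn.executemany(
    "insert into attendant values (?)",
    [
        ("2018-04-09 14:59:45",),
        ("2018-04-09 14:59:51",),
        ("2018-04-11 08:31:02",),
        ("2018-04-11 12:06:52",),
        ("2018-04-11 12:10:29",),
        ("2018-04-11 12:23:09",),
        ("2018-04-11 12:43:47",),
        ("2018-04-11 17:07:40",),
    ],
)

# Inner select: one grouping, by date only, producing the counts.
# Outer select: the detailed rows, each joined to the count for its own day.
query = """
select a.ts, d.cnt
from attendant a
join (
    select date(ts) as day, count(*) as cnt
    from attendant
    group by date(ts)
) d on d.day = date(a.ts)
order by a.ts
"""
joined_rows = conn.execute(query).fetchall()
joined_counts = [cnt for _, cnt in joined_rows]
print(joined_counts)  # [2, 2, 6, 6, 6, 6, 6, 6]
```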
As Gordon mentioned, post-99 SQL brought Window Functions, which could let you have both sets of criteria in one select, but they are not available in Firebird 2.x anyway.
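For servers that do have window functions, the single-select form could look like the sketch below. It is run against in-memory SQLite purely for illustration, which is my assumption and needs SQLite 3.25 or newer; the one-column table is hypothetical and date(ts) again stands in for the cast.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+ for window functions) stands in
# for a server that supports them.
conn = sqlite3.connect(":memory:")
conn.execute("create table attendant (ts text)")  # hypothetical, trimmed-down table
conn.executemany(
    "insert into attendant values (?)",
    [
        ("2018-04-09 14:59:45",),
        ("2018-04-09 14:59:51",),
        ("2018-04-11 08:31:02",),
        ("2018-04-11 12:06:52",),
        ("2018-04-11 12:10:29",),
        ("2018-04-11 12:23:09",),
        ("2018-04-11 12:43:47",),
        ("2018-04-11 17:07:40",),
    ],
)

# The window partitions rows by date without collapsing them,
# so every detailed row keeps its timestamp and gains its day's count.
windowed_rows = conn.execute(
    "select ts, count(*) over (partition by date(ts)) as cnt "
    "from attendant order by ts"
).fetchall()
windowed_counts = [cnt for _, cnt in windowed_rows]
print(windowed_counts)  # [2, 2, 6, 6, 6, 6, 6, 6]
```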