Count equal date parts of the timestamp column ACROSS the groups by the said column

Problem description

I would like to count duplicate date values. I already know that my field "WHEN" is a timestamp, so I have to cast it to the date type.

My actual query is as follows:

SELECT 
    u.USERNAME,  
    r."WHEN",
    r.UPDATEINOUT,
    case (r.UPDATEINOUT) when 0 then  0 when 1 then 1 else r.INOUT end INOUT
FROM 
    ATTENDANT r 
LEFT JOIN  
    USERS u ON r.USERID = u.ID 
where 
    u.USERNAME = 'rk' and (r.UPDATEINOUT = 1 or r.UPDATEINOUT = 0 or r.UPDATEINOUT is null)
group by
    r."WHEN",
    INOUT,
    u.USERNAME,
    r.UPDATEINOUT
order by 
    r."WHEN"

This is the result:

Username  WHEN                        UPDATEINOUT  INOUT

rk        09.04.2018, 14:59:45.000    [null]       0
rk        09.04.2018, 14:59:51.000    [null]       1
rk        11.04.2018, 08:31:02.000    [null]       0
rk        11.04.2018, 12:06:52.000    [null]       1
rk        11.04.2018, 12:10:29.000    [null]       0
rk        11.04.2018, 12:23:09.000    [null]       1
rk        11.04.2018, 12:43:47.000    [null]       0
rk        11.04.2018, 17:07:40.000    [null]       1

Now I would like to count the duplicate dates. The result should look like this:

Username  WHEN                        UPDATEINOUT  INOUT  Count

rk        09.04.2018, 14:59:45.000    [null]       0      2
rk        09.04.2018, 14:59:51.000    [null]       1      2
rk        11.04.2018, 08:31:02.000    [null]       0      6
rk        11.04.2018, 12:06:52.000    [null]       1      6
rk        11.04.2018, 12:10:29.000    [null]       0      6
rk        11.04.2018, 12:23:09.000    [null]       1      6
rk        11.04.2018, 12:43:47.000    [null]       0      6
rk        11.04.2018, 17:07:40.000    [null]       1      6

When I add

count(cast(r."WHEN" as date))

then it shows me only one.

Recommended answer

It seems the other answers suggest "what to do" without trying to explain why the results are what they are, that is, what is actually happening. The topic starter is being given a fish, not a fishing rod.

When I add count(cast(r."WHEN" as date)) then it shows me only one.

This is a bit funny. This line shows a novice's confusion between a natural human language like English and a language of mathematical sets like SQL.

R.Kut reads the line he added as "I want to count how many [distinct] values of the given expression there are".

But that is not what this command actually means in SQL. In SQL it means "I want to count how many rows there are where the given expression is not null."

And so there is actually NO DIFFERENCE between count(cast(r."WHEN" as date)) and count(r."WHEN"): for any given row, those two argument expressions are either both NULL or both NOT NULL. Hence the counts over those equally nullable arguments are equal too.
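A minimal sketch of that rule, using a hypothetical table T_DEMO that is not part of the original question (the table, its column and the sample rows are assumptions for illustration only). Two non-null timestamps plus one NULL make the difference between counting rows and counting non-null expressions visible:

```sql
-- Hypothetical demo table; not part of the original schema.
create table T_DEMO (TS timestamp);

insert into T_DEMO values ('2018-04-11 08:31:02');
insert into T_DEMO values ('2018-04-11 12:06:52');
insert into T_DEMO values (null);

commit;

select
    count(*)                as ALL_ROWS,      -- every row is counted: 3
    count(TS)               as NOT_NULL_TS,   -- rows where TS is not null: 2
    count(cast(TS as date)) as NOT_NULL_DATE  -- the cast does not change nullability: 2
from T_DEMO;
```

The cast changes the value that is inspected, but not whether that value is NULL, so the last two columns always agree.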

The very attempt to truncate the value of an aggregate function's argument, as if that could alter its nullability, is a misconception. I was there too. It takes time to get used to what aggregates over mathematical sets really mean, and to the fact that you are not reading English when you are reading SQL.

Frankly, you could just do count (1) here, removing not only the typecast but the column itself. It would still be the same, since it is rows, not values, that are being counted. Unless there are rows where "WHEN" IS NULL: those would be accounted for by group by but not by count. After you read and think through the next section, come back and toy with http://sqlfiddle.com/#!9/ee09a/7

Now, there is another parameter to the count function that I kind of mentioned above. It is the "distinct" parameter.

Remark: One may say distinct is a keyword of the SQL language, not a parameter of a function. But de facto, if not de jure, it changes the way the function works, so in my perception it is a parameter, passed in that unusual SQLish way parameters are often given to functions. Or, to reason about it another way, it may be seen as part of the function name, if we imagine having two functions to choose from, count and count-distinct.

So, the topic starter could have added count(distinct cast(r."WHEN" as date)) instead, and...

....and seen that nothing changed. Because while this time he would really be telling the server to count rows with not-null (always only not-null!) and distinct values, the counting still goes on within the group.

And what are those groups?

group by
    r."WHEN",
    INOUT,
    u.USERNAME,
    r.UPDATEINOUT

See, across the groups we have rows with distinct date-and-time values of "WHEN" (and of other columns too, but I am not focusing on them). The point is that within every group both the time and the date parts of "WHEN" stay the same, and "being the same" means "there is one distinct value, repeated again and again". And if there is only one distinct date-and-time value, then the reduced time-only or date-only values have the same count of one: reducing a value can only make two previously different values equal, never make previously equal values different.

Granted, in other situations, when the counting goes over columns not included in the grouping (or when there is no group by clause at all), the result may differ. There count(distinct ...) might do what the topic starter expects it to do.

http://sqlfiddle.com/#!9/0d65bf/7 - an example.

But one has to keep in mind:

  • that happens at the expense of extra sorting and grouping work done by the server, potentially making the query slow or consuming a lot of memory
  • it would still work within the group (the groups would just start to contain different values for the counted columns)
  • ....it is just that sometimes the group is the whole query result set (in the simplest variant, the whole table), if the programmer did not set it to be different by adding a group by clause :-D

The code of the example, linked above:

create table X(a integer, b integer);

insert into X values (1,1);
insert into X values (1,2);
insert into X values (1,2);

commit;

select count(distinct b) from x group by a;
-- Result: 1 row: 2

-- or if the whole table is the group
select count(distinct b) from x;
-- Result: 1 row: 2

-- but if the group includes the counted column
-- then every group would contain EXACTLY ONE
-- distinct not-null value (possibly repeated in several rows)
select count(distinct b) from x group by b;
-- Result: 2 rows: 1 and 1


Here we come to yet another modification to toy with.

group by
    cast(r."WHEN" as date),   -- <====
    INOUT,
    u.USERNAME,
    r.UPDATEINOUT

Now, this time we are telling the server to assemble groups in which only the "date" part of "WHEN" is the same, while the "time" part may differ.

But...

  1. I think not every SQL server supports expressions in the group by clause.
  2. Unless there is an index built on exactly that expression, the server would have to do a lot of extra work, probably resorting to a natural scan and grouping via temporary files, which would make the query heavy and slow.
  3. And of course you would HAVE to modify the column list accordingly.

SELECT 
    u.USERNAME,  
    cast(r."WHEN" as date),  -- <=== no more raw r."WHEN"
    r.UPDATEINOUT,

Because, well, you just cannot have "One True Value" for WHEN in a row if you explicitly asked the server to group different values of it together.
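A minimal sketch of what grouping by the date part alone produces. The table ATT_BY_DATE and its five sample rows are assumptions made up for illustration (a cut-down stand-in for the question's ATTENDANT table), and the server is assumed to accept expressions in group by:

```sql
-- Hypothetical demo table; a reduced stand-in for ATTENDANT.
create table ATT_BY_DATE (TS timestamp, INOUT integer);

insert into ATT_BY_DATE values ('2018-04-09 14:59:45', 0);
insert into ATT_BY_DATE values ('2018-04-09 14:59:51', 1);
insert into ATT_BY_DATE values ('2018-04-11 08:31:02', 0);
insert into ATT_BY_DATE values ('2018-04-11 12:06:52', 1);
insert into ATT_BY_DATE values ('2018-04-11 12:10:29', 0);

commit;

-- One group per calendar date; the individual times are no longer available.
select
    cast(TS as date) as TS_DATE,
    count(*)         as CNT
from ATT_BY_DATE
group by cast(TS as date);
-- Result: 2 rows: (2018-04-09, 2) and (2018-04-11, 3)
```

The counts are the ones the topic starter wants, but the five original rows have collapsed into two, which is exactly the trade-off described above.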

Here you may come to the conclusion that there is no simple, straightforward way to have both per-group and across-the-groups aggregates. Or, to put it differently, to have two different sets of group by in one query (I do not want to mention UNION here, mkay?).

You would need ONE set of group by criteria for counting rows with the same date part (but maybe a different time part), and yet ANOTHER set of group by criteria for picking and rendering groups that differ by both date and time.

In plain and simple SQL 1999 that means you would have to make TWO selects to get both groupings, one way or another, and this is where Yogesh's and Gordon's answers kick in.
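One possible shape of such a two-select query is sketched below. It is not the text of Yogesh's or Gordon's answers, which are not reproduced here. The table ATT_TWO_SEL and its rows are hypothetical, and the sketch assumes a server that supports derived tables in the from clause:

```sql
-- Hypothetical demo table; a reduced stand-in for ATTENDANT.
create table ATT_TWO_SEL (TS timestamp, INOUT integer);

insert into ATT_TWO_SEL values ('2018-04-09 14:59:45', 0);
insert into ATT_TWO_SEL values ('2018-04-09 14:59:51', 1);
insert into ATT_TWO_SEL values ('2018-04-11 08:31:02', 0);
insert into ATT_TWO_SEL values ('2018-04-11 12:06:52', 1);
insert into ATT_TWO_SEL values ('2018-04-11 12:10:29', 0);

commit;

-- Outer select: keeps every detail row with its full timestamp.
-- Inner select: counts rows per calendar date.
select
    d.TS,
    d.INOUT,
    c.CNT
from ATT_TWO_SEL d
join (
    select cast(TS as date) as TS_DATE, count(*) as CNT
    from ATT_TWO_SEL
    group by cast(TS as date)
) c on c.TS_DATE = cast(d.TS as date)
order by d.TS;
-- Result: 5 rows, with CNT = 2, 2, 3, 3, 3
```

Each select carries its own grouping criteria: the inner one groups by date only, while the outer one does not group at all and simply looks up the matching count.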

As Gordon mentioned, post-99 SQL introduced window functions, which let you have both sets of criteria in one select, but they are not available in Firebird 2.x anyway.
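For servers that do have window functions (Firebird 3.0 and later, as far as I know), a sketch could look like the following. The table ATT_WIN and its rows are again hypothetical:

```sql
-- Hypothetical demo table; a reduced stand-in for ATTENDANT.
create table ATT_WIN (TS timestamp, INOUT integer);

insert into ATT_WIN values ('2018-04-09 14:59:45', 0);
insert into ATT_WIN values ('2018-04-09 14:59:51', 1);
insert into ATT_WIN values ('2018-04-11 08:31:02', 0);
insert into ATT_WIN values ('2018-04-11 12:06:52', 1);
insert into ATT_WIN values ('2018-04-11 12:10:29', 0);

commit;

-- The window counts rows sharing the same date part,
-- while the select itself still returns every detail row.
select
    TS,
    INOUT,
    count(*) over (partition by cast(TS as date)) as CNT
from ATT_WIN
order by TS;
-- Result: 5 rows, with CNT = 2, 2, 3, 3, 3
```

The partition by clause plays the role of the second, date-only set of grouping criteria without collapsing the rows.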
