有条件的总结 [英] Conditional summaring

查看:81
本文介绍了有条件的总结的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

大家好

我正在尝试按照以下格式创建一个夏季降雨数据表,其中每0.1毫米降雨量保持一个记录

: br />

电台名称,时间,价值

A,2002-12-03 14:44:41.000,0.1

A,2002 -12-03 14:45:45.000,0.1

A,2002-12-03 14:49:45.000,0.1

A,2002-12-09 05: 30:35.000,0.1

A,2002-12-09 05:30:37.000,0.1

A,2002-12-09 05:33:05.000,0.1

B,2002-12-09 05:32:47.000,0.1

B,2002-12-09 05:33:00.000,0.1

B,2002-12-09 05:35:00.000,0.1

B,2002-12-09 05:37:48.000,0.1

....

B,2003-02-09 01:32:47.000,0.1

B,2003-02-09 05:32:47.000,0.1

问题是我需要计算每个站的降雨事件和

产生事件的开始和结束时间(基于时间

字段并总结了活动期间下雨的数量。

An ;事件"对于这种情况,在一系列降雨测量之前和之后,在12小时之间存在至少间隙的情况,并且没有下雨

(即暴雨) 。表中的数据以这样的方式写成

,每行保持雨量为0.1毫米雨量的时间

。 />

例如,上述日期应该产生:


电台名称,开始时间,结束时间,总雨量

A,2002-12-03 14:44:41.000,2002-12-03 14:49:45.000,0.3

A,2002-12-09 05:30:35.000,2002-12- 09 05:33:05.000,0.3

B,2002-12-09 05:32:47.000,2002-12-09 05:37:48.000,0.3

B ,2003-02-09 01:32:47.000,2003-02-09 05:32:47.000,0.2

作为SQL的新手,我不知道它是否可以完成。


非常感谢,

Ilik

解决方案

< blockquote> 2005年2月14日08:00:24 -0800,Ilik写道:

大家好
我正在尝试创建一个夏日的雨数据表每隔0.1毫米降雨量的记录,格式如下:

电台名称,时间,Va Lue
A,2002-12-03 14:44:41.000,0.1
A,2002-12-03 14:45:45.000,0.1
A,2002-12-03 14: 49:45.000,0.1
A,2002-12-09 05:30:35.000,0.1
A,2002-12-09 05:30:37.000,0.1
A,2002-12 -09 05:33:05.000,0.1
B,2002-12-09 05:32:47.000,0.1
B,2002-12-09 05:33:00.000,0.1
B ,2002-12-09 05:35:00.000,0.1
B,2002-12-09 05:37:48.000,0.1
...
B,2003-02-09 01 :32:47.000,0.1
B,2003-02-09 05:32:47.000,0.1

问题是我需要计算每个车站的降雨事件和
产生事件的开始和结束时间(基于时间
字段)并总结事件期间下降的雨量。
事件。就此而言,在一系列降雨测量之前和之后(即暴雨),在至少12小时的间隙和没有降雨的情况下。表格中的数据以这样的方式写成:每行保持雨量计计算下雨量为0.1毫米的时间。

例如,上述日期应该产生:

车站名称,开始时间,结束时间,总雨量
A,2002-12-03 14:44:41.000,2002-12-03 14: 49:45.000,0.3
A,2002-12-09 05:30:35.000,2002-12-09 05:33:05.000,0.3
B,2002-12-09 05:32:47.000 ,2002-12-09 05:37:48.000,0.3
B,2003-02-09 01:32:47.000,2003-02-09 05:32:47.000,0.2

作为SQL的新手,我不知道是否可以做到。




嗨Ilik,


它可以完成,但它很复杂。你必须使用几个

技巧,然后将它们组合成一个查询。


技巧1:找到下雨期的开始。没有其他行

的每行在前12个小时内存在时间值,标志着

降雨期的开始。您可以使用NOT EXISTS或使用OUTER JOIN找到这些。

SELECT Station

,时间

来自MyTable AS

什么地方没有(选择*

来自MyTable AS b

WHERE b.Station = a.Station

AND b.Time< ; a.Time

AND b.Time> = DATEADD(小时,-12,a.Time))



SELECT a .Station

,a.Time

来自MyTable AS

LEFT OUTER加入MyTable AS b

ON b .Station = a.Station

AND b.Time< a.Time

AND b.Time> = DATEADD(小时,-12,a.Time)

我将在其余部分使用NOT EXISTS版本这条信息;如果

表现对你很重要,我会建议你测试两个版本。


技巧2:找到下雨期结束。这与技巧1基本上是相同的

技术。


技巧3:匹配每个下雨期的开始和结束时间。对于每个

开始,匹配结束是在

开始之后发生的所有结束的第一个。

SELECT a.Station

,a.Time AS StartTime

,MIN(b.Time)AS EndTime

来自MyTable AS

INNER JOIN MyTable AS b

ON b.Station = a.Station

AND b.Time> a.Time

什么不存在(选择*

来自MyTable AS c

WHERE c.Station = a.Station

AND c.Time< a.Time

AND c.Time> = DATEADD(小时,-12,a.Time))

而不是EXISTS (选择*

来自MyTable AS d

WHERE d.Station = b.Station

AND d.Time> b.T​​ime

AND d.Time< = DATEADD(小时,+ 12,b.Time))

GROUP BY a.Station

,a.Time


技巧4:查找下雨期开始和结束时间之间的所有行。这个

可以通过使用上一步的结果作为派生表

并将其加入原始表来完成,但在这种情况下,我决定

使用一个相关子查询扩展上一步的结果。

SELECT a.Station

,a.Time AS StartTime

,MIN(b.Time)AS EndTime

,(SELECT SUM(e.Value)

来自MyTable AS e

WHERE e .Station = a.Station

AND e.Time> = a.Time

AND e.Time< = MIN(b.Time))AS TotalRain

来自MyTable AS

INNER JOIN MyTable AS b

ON b.Station = a.Station

AND b。时间> a.Time

什么不存在(选择*

来自MyTable AS c

WHERE c.Station = a.Station

AND c.Time< a.Time

AND c.Time> = DATEADD(小时,-12,a.Time))

而不是EXISTS (选择*

来自MyTable AS d

WHERE d.Station = b.Station

AND d.Time> b.T​​ime

AND d.Time< = DATEADD(小时,+ 12,b.Time))

GROUP BY a.Station

,a.Time


注意:上面的所有查询都是未经测试的。要获得经过测试的回复,您需要

来发布将在我的系统上重新创建表和数据的SQL,如图所示

in www.aspfaq.com/5006


Best,Hugo

-


(删除_NO_和_SPAM_以获取我的电子邮件地址)


[已发布和邮寄,请回复新闻]


Ilik(le***@walla.co.il)写道:

问题是我需要算下雨事件对于每个站点,并产生事件的开始和结束时间(基于时间
字段)并总结事件期间下降的雨量。
一个事件。就此而言,在一系列降雨测量之前和之后(即暴雨),在至少12小时的间隙和没有降雨的情况下。表格中的数据以这样的方式写成:每行保持雨量计计算下雨量为0.1毫米的时间。




绝不是一个小问题。正如Hugo所说,CREATE TABLE和INSERT

语句是好事。然后,在这种情况下,表格很容易构成,就像INSERT语句一样。而你的

叙述和样本数据非常好。谢谢!


这是一个多步骤的方法。也许不是最优雅的,但是它会给出正确的结果。如果数据量很大,那么性能可能是一个问题。


关于日期运动的特别说明。我每分钟做好一次,

不是每小时。这是因为约会(HOUR,''00:15:00'','12:10:10'')

是12,虽然它不是完全12小时。 dateiff总是算是

跨越边界。此外,我发现coalesce需要特别照顾。

最初我有19000101和99991231,但这让我溢出。


CREATE TABLE数据(站点字符(1)NOT NULL,

time datetime NOT NULL,

value float NOT NULL,

CONSTRAINT pk_data PRIMARY KEY(station,time))

go

INSERT数据(电台,时间,价值)

SELECT''A'',''20021203 14:44:41.000' ',0.1 UNION

SELECT''A'',''20021203 14:45:45.000'',0.1 UNION

SELECT''A'','''20021203 14:49:45.000'',0.1 UNION

SELECT''A'',''20021209 05:30:35.000'',0.1 UNION

SELECT''A '',''20021209 05:30:37.000'',0.1 UNION

SELECT''A'',''20021209 05:33:05.000'',0.1 UNION

SELECT''B'',''20021209 05:32:47.000'',0.1 UNION

SELECT''B'',''20021209 05:33:00.000'',0.1 UNION

SELECT''B'',''20021209 05:35:00.000'',0.1 UNION

SELECT''B'',''20021209 05:37 :48.000'',0.1 UNION

SELECT''B'',''20030209 01:32:47.000'',0.1 UNION

SELECT''B'', ''20030209 05:32:47.000'',0.1 UNION

SELECT''B'',''20030209 18:32:47.000'',0.1 UNION

SELECT ''B'',''20030212 05:32:47.000'',0.1

go

CREATE TABLE #temp(station char(1)NOT NULL,

time datetime NOT NULL,

n int NOT NULL,

starttime datetime NULL,

endtime datetime NULL,

值浮动NOT NULL,

PRIMARY KEY(站,n))

go

INSERT #temp(站,时间,值,n)

SELECT a.station,a.time,a.value,

(SELECT COUNT(*)+ 1 FROM FROM b

WHERE a.station = b.station AND a.time> b.time)

来自数据a

go

更新b

SET starttime = CASE WHEN dateiff(分钟) ,

合并(a.time,dateadd(DAY,-1,b.time)),

b.time)> 60 * 12

那么b.time

结束,

endtime = CASE WHEN dateiff(分钟,

b .time,

coalesce(c.time,dateadd(DAY,1,b.time)))> 60 * 12

那么b.time

结束

来自#temp b

LEFT JOIN #temp a ON a.station = b.station AND bn - 1 = a

LEFT JOIN #temp c ON b.station = c.station AND bn + 1 = cn

go

UPDATE #temp

SET starttime =(SELECT MAX(b.starttime)

FROM #temp b

WHERE b.station = a.station

AND b.starttime< a.time)

FROM #temp a

WHERE a.starttime IS NULL

go

SELECT站,starttime,MAX(结束时间),SUM(值)

FROM #temp

GROUP BY电台,开始时间

go

DROP TABLE #temp

DROP TABLE数据

-

Erland Sommarskog,SQL Server MVP, es****@sommarskog.se


SQL Server SP3联机丛书
http://www.microsoft。 com / sql / techinf ... 2000 / books.asp


2005年2月15日星期二21:28:13 +0000(UTC),Erland Sommarskog写道:

绝不是一个小问题。正如Hugo所说,CREATE TABLE和INSERT
语句是好事。然后,在这种情况下,表格很容易组成,INSERT语句也是如此。


嗨Erland,


你说得对。我想这只是表明我很懒;-)


我只是偷了你的工作来测试我的查询并发现了一个小错误。

更正版本(适用于您选择的表格和列名称)是


SELECT a.station

,a.time AS Starttime

,MIN(b.time)AS Endtime

,(SELECT SUM(e.value)

FROM FROM data AS e

WHERE e.station = a.station

AND e.time> = a.time

AND e.time< = MIN(b.time))AS TotalRain

来自数据AS

INNER JOIN数据AS b

ON b.station = a。车站

和b.time> = a.time - 改变是在这一行

什么不存在(选择*

FROM数据AS c

WHERE c.station = a.station

AND c.time< a.time

AND c.time> = DATEADD(小时,-12,a.time))

而不是EXISTS(SELECT *

FROM data AS d

WHERE d。 station = b.station

AND d.time> b.time

AND d.time< = DATEADD(小时,+ 12,b.time))

GROUP B Y a.station

,a.time

你的
叙述和样本数据非常好。谢谢!


我只能说它。


这是一个多步骤的方法。也许不是最优雅的,但它会给出正确的结果。如果数据量很大,性能可能会成为一个问题。


我为第一印象做了一些快速测试。根据两个查询从开始到结束时间的差异

,使用提供的示例数据,

它太靠近了。增加测试数据量(复制

插入两次并将电台名称更改为C,D,E和F)

为我带来了一个小优势查询,但测试数据如此之少,

并没有真正说什么。


使用set statistics io,我的查询使用了


表''数据''。扫描计数151,逻辑读取302,物理读取0,

预读读取0.


并使用您的代码


表''#temp __(snip)''。扫描计数0,逻辑读取86,物理读取0,

预读读取0.

表''数据''。扫描计数43,逻辑读取86,物理读取0,

预读读取0.


表''#temp __(snip)''。扫描计数85,逻辑读取254,物理读取0,

预读读取0.

表''工作表''。扫描计数1,逻辑读取44,物理读取0,

预读读取0.


表''#temp __(snip)''。扫描计数25,逻辑读取98,物理读取0,

预读读取0.

表''工作表''。扫描计数1,逻辑读取26,物理读取0,

预读读取0.


表''#temp __(snip)''。扫描计数1,逻辑读取2,物理读取0,

预读读取0.


添加这些数字表明您的版本看起来相当一些

比我的版本更多的工作。


我有兴趣听听Ilik的版本比较他的

系统,带有真实数据。


关于日期运动的特别说明。我每分钟做一次,不是每小时一次。这是因为约瑟夫(HOUR,''00:15:00'','12:10:10'')是12,虽然它不是完全12小时。约会总是算是跨界。另外,我发现coalesce需要特别照顾。
最初我有19000101和99991231,但这让我溢出。




为什么你不是使用DATEADD(小时,12,a.time)>那时候呢?难道不是吗?
已经解决了所有这些问题?


Best,Hugo

-


(删除_NO_和_SPAM_以获取我的电子邮件地址)


Hi all
I''m trying to create a summery table of rain data that holds a record
for every 0.1mm of rain in the following format:

Station Name, Time, Value
A, 2002-12-03 14:44:41.000, 0.1
A, 2002-12-03 14:45:45.000, 0.1
A, 2002-12-03 14:49:45.000, 0.1
A, 2002-12-09 05:30:35.000, 0.1
A, 2002-12-09 05:30:37.000, 0.1
A, 2002-12-09 05:33:05.000, 0.1
B, 2002-12-09 05:32:47.000, 0.1
B, 2002-12-09 05:33:00.000, 0.1
B, 2002-12-09 05:35:00.000, 0.1
B, 2002-12-09 05:37:48.000, 0.1
....
B, 2003-02-09 01:32:47.000, 0.1
B, 2003-02-09 05:32:47.000, 0.1

The problem is that I need to count rain events for each station and
produce the starting and ending time of the event (based on the Time
field) and to summarize the amount of rain that fell during the event.
An "event" for that matter is a case where there is a gap of at least
12 hours with no rain before and after a series of rain measurements
(i.e., a rain storm). The data in the table is written in such a way
that each row hold the time where an amount on 0.1mm of rain was
counted by the rain gauge.

For example, the above date should produce:

Station Name, Start Time, End Time, Total rain
A, 2002-12-03 14:44:41.000, 2002-12-03 14:49:45.000, 0.3
A, 2002-12-09 05:30:35.000, 2002-12-09 05:33:05.000, 0.3
B, 2002-12-09 05:32:47.000, 2002-12-09 05:37:48.000, 0.3
B, 2003-02-09 01:32:47.000, 2003-02-09 05:32:47.000, 0.2

As a newbie to SQL I don''t know if it can be done.

Many thanks,
Ilik

解决方案

On 14 Feb 2005 08:00:24 -0800, Ilik wrote:

Hi all
I''m trying to create a summery table of rain data that holds a record
for every 0.1mm of rain in the following format:

Station Name, Time, Value
A, 2002-12-03 14:44:41.000, 0.1
A, 2002-12-03 14:45:45.000, 0.1
A, 2002-12-03 14:49:45.000, 0.1
A, 2002-12-09 05:30:35.000, 0.1
A, 2002-12-09 05:30:37.000, 0.1
A, 2002-12-09 05:33:05.000, 0.1
B, 2002-12-09 05:32:47.000, 0.1
B, 2002-12-09 05:33:00.000, 0.1
B, 2002-12-09 05:35:00.000, 0.1
B, 2002-12-09 05:37:48.000, 0.1
...
B, 2003-02-09 01:32:47.000, 0.1
B, 2003-02-09 05:32:47.000, 0.1

The problem is that I need to count rain events for each station and
produce the starting and ending time of the event (based on the Time
field) and to summarize the amount of rain that fell during the event.
An "event" for that matter is a case where there is a gap of at least
12 hours with no rain before and after a series of rain measurements
(i.e., a rain storm). The data in the table is written in such a way
that each row hold the time where an amount on 0.1mm of rain was
counted by the rain gauge.

For example, the above date should produce:

Station Name, Start Time, End Time, Total rain
A, 2002-12-03 14:44:41.000, 2002-12-03 14:49:45.000, 0.3
A, 2002-12-09 05:30:35.000, 2002-12-09 05:33:05.000, 0.3
B, 2002-12-09 05:32:47.000, 2002-12-09 05:37:48.000, 0.3
B, 2003-02-09 01:32:47.000, 2003-02-09 05:32:47.000, 0.2

As a newbie to SQL I don''t know if it can be done.



Hi Ilik,

It can be done, but it''s quite complicated. You have to use several
tricks, then combine them into one query.

Trick 1: Find the start of a rain period. Each row for which no other row
exists with a time value in the preceding 12 hours marks the start of a
rain period. You can find these using NOT EXISTS or using an OUTER JOIN.
SELECT Station
, Time
FROM MyTable AS a
WHERE NOT EXISTS (SELECT *
FROM MyTable AS b
WHERE b.Station = a.Station
AND b.Time < a.Time
AND b.Time >= DATEADD(hour, -12, a.Time))
or
SELECT a.Station
, a.Time
FROM MyTable AS a
LEFT OUTER JOIN MyTable AS b
ON b.Station = a.Station
AND b.Time < a.Time
AND b.Time >= DATEADD(hour, -12, a.Time)
I''ll use the NOT EXISTS version in the remainder of this message; if
performance is important for you, I''d advise you to test both versions.

Trick 2: Find the end of a rain period. This is basically the same
technique as trick 1.

Trick 3: Match up the start and end times of each rain period. For each
start, the matching end is the FIRST of all end''s that occur AFTER that
start.
SELECT a.Station
, a.Time AS StartTime
, MIN(b.Time) AS EndTime
FROM MyTable AS a
INNER JOIN MyTable AS b
ON b.Station = a.Station
AND b.Time > a.Time
WHERE NOT EXISTS (SELECT *
FROM MyTable AS c
WHERE c.Station = a.Station
AND c.Time < a.Time
AND c.Time >= DATEADD(hour, -12, a.Time))
AND NOT EXISTS (SELECT *
FROM MyTable AS d
WHERE d.Station = b.Station
AND d.Time > b.Time
AND d.Time <= DATEADD(hour, +12, b.Time))
GROUP BY a.Station
, a.Time

Trick 4: Find all rows between start and end time of a rain period. This
can be done by using the result of the previous step as a derived table
and joining that to the original table, but in this case, I decided to
extend the result of the previous step with one correlated subquery.
SELECT a.Station
, a.Time AS StartTime
, MIN(b.Time) AS EndTime
,(SELECT SUM(e.Value)
FROM MyTable AS e
WHERE e.Station = a.Station
AND e.Time >= a.Time
AND e.Time <= MIN(b.Time)) AS TotalRain
FROM MyTable AS a
INNER JOIN MyTable AS b
ON b.Station = a.Station
AND b.Time > a.Time
WHERE NOT EXISTS (SELECT *
FROM MyTable AS c
WHERE c.Station = a.Station
AND c.Time < a.Time
AND c.Time >= DATEADD(hour, -12, a.Time))
AND NOT EXISTS (SELECT *
FROM MyTable AS d
WHERE d.Station = b.Station
AND d.Time > b.Time
AND d.Time <= DATEADD(hour, +12, b.Time))
GROUP BY a.Station
, a.Time

Note: all the queries above are untested. To get tested replies, you need
to post SQL that will recreate your tables and data on my system, as shown
in www.aspfaq.com/5006.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)


[posted and mailed, please reply in news]

Ilik (le***@walla.co.il) writes:

The problem is that I need to count rain events for each station and
produce the starting and ending time of the event (based on the Time
field) and to summarize the amount of rain that fell during the event.
An "event" for that matter is a case where there is a gap of at least
12 hours with no rain before and after a series of rain measurements
(i.e., a rain storm). The data in the table is written in such a way
that each row hold the time where an amount on 0.1mm of rain was
counted by the rain gauge.



By no means a trivial problem. As Hugo said, CREATE TABLE and INSERT
statements are alwyas a good thing. Then again, the table was easy
to compose in this case, as were the INSERT statement. And your
narrative and sample data, was very good. Thanks!

Here is a multi-step approach. Maybe not the most elegant, but it
gives the correct result. If data volumes are large, performance
could be an issue, though.

A particular note on the datediff exercise. I do datediff per minute,
not per hour. This is because datediff(HOUR, ''00:15:00'', ''12:10:10'')
is 12, although it''s not fully 12 hours. datediff always counts
cross boundaries. Also, I found that coalesce needed special care.
Originally I had 19000101, and 99991231, but that gave me overflow.

CREATE TABLE data(station char(1) NOT NULL,
time datetime NOT NULL,
value float NOT NULL,
CONSTRAINT pk_data PRIMARY KEY (station, time))
go
INSERT data (station, time, value)
SELECT ''A'', ''20021203 14:44:41.000'', 0.1 UNION
SELECT ''A'', ''20021203 14:45:45.000'', 0.1 UNION
SELECT ''A'', ''20021203 14:49:45.000'', 0.1 UNION
SELECT ''A'', ''20021209 05:30:35.000'', 0.1 UNION
SELECT ''A'', ''20021209 05:30:37.000'', 0.1 UNION
SELECT ''A'', ''20021209 05:33:05.000'', 0.1 UNION
SELECT ''B'', ''20021209 05:32:47.000'', 0.1 UNION
SELECT ''B'', ''20021209 05:33:00.000'', 0.1 UNION
SELECT ''B'', ''20021209 05:35:00.000'', 0.1 UNION
SELECT ''B'', ''20021209 05:37:48.000'', 0.1 UNION
SELECT ''B'', ''20030209 01:32:47.000'', 0.1 UNION
SELECT ''B'', ''20030209 05:32:47.000'', 0.1 UNION
SELECT ''B'', ''20030209 18:32:47.000'', 0.1 UNION
SELECT ''B'', ''20030212 05:32:47.000'', 0.1
go
CREATE TABLE #temp (station char(1) NOT NULL,
time datetime NOT NULL,
n int NOT NULL,
starttime datetime NULL,
endtime datetime NULL,
value float NOT NULL,
PRIMARY KEY (station, n))
go
INSERT #temp (station, time, value, n)
SELECT a.station, a.time, a.value,
(SELECT COUNT(*) + 1 FROM data b
WHERE a.station = b.station AND a.time > b.time)
FROM data a
go
UPDATE b
SET starttime = CASE WHEN datediff(minute,
coalesce(a.time, dateadd(DAY, -1, b.time)),
b.time) > 60*12
THEN b.time
END,
endtime = CASE WHEN datediff(minute,
b.time,
coalesce(c.time, dateadd(DAY, 1, b.time))) > 60*12
THEN b.time
END
FROM #temp b
LEFT JOIN #temp a ON a.station = b.station AND b.n - 1 = a.n
LEFT JOIN #temp c ON b.station = c.station AND b.n + 1 = c.n
go
UPDATE #temp
SET starttime = (SELECT MAX(b.starttime)
FROM #temp b
WHERE b.station = a.station
AND b.starttime < a.time)
FROM #temp a
WHERE a.starttime IS NULL
go
SELECT station, starttime, MAX(endtime), SUM(value)
FROM #temp
GROUP BY station, starttime
go
DROP TABLE #temp
DROP TABLE data
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp


On Tue, 15 Feb 2005 21:28:13 +0000 (UTC), Erland Sommarskog wrote:

By no means a trivial problem. As Hugo said, CREATE TABLE and INSERT
statements are alwyas a good thing. Then again, the table was easy
to compose in this case, as were the INSERT statement.
Hi Erland,

You''re absolutely right. I guess this just shows that I''m lazy ;-)

I just stole your work to test my query and found one small error. The
corrected version (adapted to the table and column names you''ve chosen) is

SELECT a.station
, a.time AS Starttime
, MIN(b.time) AS Endtime
,(SELECT SUM(e.value)
FROM data AS e
WHERE e.station = a.station
AND e.time >= a.time
AND e.time <= MIN(b.time)) AS TotalRain
FROM data AS a
INNER JOIN data AS b
ON b.station = a.station
AND b.time >= a.time -- The change is in this line
WHERE NOT EXISTS (SELECT *
FROM data AS c
WHERE c.station = a.station
AND c.time < a.time
AND c.time >= DATEADD(hour, -12, a.time))
AND NOT EXISTS (SELECT *
FROM data AS d
WHERE d.station = b.station
AND d.time > b.time
AND d.time <= DATEADD(hour, +12, b.time))
GROUP BY a.station
, a.time
And your
narrative and sample data, was very good. Thanks!
I can only second that.

Here is a multi-step approach. Maybe not the most elegant, but it
gives the correct result. If data volumes are large, performance
could be an issue, though.
I ran some quick tests for a first impression. Based on the difference
from start to end time for both queries, using the sample data provided,
it was too close to call. Increasing the amount of test data (copying the
insert two more times and changing the station names to C, D, E and F)
resulted in a small advantage for my query, but with so little test data,
that is not really saying anything.

With set statistics io on, my query used

Table ''data''. Scan count 151, logical reads 302, physical reads 0,
read-ahead reads 0.

And your code used

Table ''#temp__(snip)''. Scan count 0, logical reads 86, physical reads 0,
read-ahead reads 0.
Table ''data''. Scan count 43, logical reads 86, physical reads 0,
read-ahead reads 0.

Table ''#temp__(snip)''. Scan count 85, logical reads 254, physical reads 0,
read-ahead reads 0.
Table ''Worktable''. Scan count 1, logical reads 44, physical reads 0,
read-ahead reads 0.

Table ''#temp__(snip)''. Scan count 25, logical reads 98, physical reads 0,
read-ahead reads 0.
Table ''Worktable''. Scan count 1, logical reads 26, physical reads 0,
read-ahead reads 0.

Table ''#temp__(snip)''. Scan count 1, logical reads 2, physical reads 0,
read-ahead reads 0.

Adding up these numbers shows that your version appears to do quite some
more work than my version.

I''d be interested to hear from Ilik how the versions compare on his
system, with real data.

A particular note on the datediff exercise. I do datediff per minute,
not per hour. This is because datediff(HOUR, ''00:15:00'', ''12:10:10'')
is 12, although it''s not fully 12 hours. datediff always counts
cross boundaries. Also, I found that coalesce needed special care.
Originally I had 19000101, and 99991231, but that gave me overflow.



Why didn''t you use DATEADD(hour, 12, a.time) > b.time, then? Wouldn''t that
have solved all these issues?

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)


这篇关于有条件的总结的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆