如何找到数据空白并插入NULL数据点而不是间隙 [英] How to find gaps of data and insert NULL data points instead having gap

查看:102
本文介绍了如何找到数据空白并插入NULL数据点而不是间隙的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个列集(我们假设列A和列B)。
每列可以有多个点(datetime和value)。
每个列集都具有预定义的时间步长。以下示例是:600秒。



columnA

  dateTime  - '2013-10-05 00:10:00'
价值 - 0.51231
dateTime - '2013-10-05 00:20:00'
价值 - 0.54123
dateTime - '2013-10-05 00:30:00'
价值 - 0.53312
...
dateTime - '2013-11-14 13:30:00'
价值 - 0.53312


columnB

  dateTime  - '2013-10-05 00:10:00'
价值 - 0.51231
dateTime - '2013-10-05 00:20:00'
价值 - 0.54123
dateTime - '2013-10-05 00:50:00'
价值 - 0.54123
...
dateTime - '2013-11-14 13:20:00'
价值 - 0.53312
dateTime - '2013-11-14 13:30:00'
价值 - 0.53312
dateTime - '2013-11-14 13:40:00'
价值 - 0.53312

我需要为每个列找到缺少的数据点,并插入NULL分,所以所有的列都将具有对于上述例子,缺少的数据点是:

  columnA  -  2013-11-14 13:40:00 
columnB - 2013-10-05 00:30:00,2013-10-05 00:40:00

我知道我可以使用T-SQL或PHP(客户端)对给定列集的每一列进行迭代,并检查每个点,但是会我觉得很慢特别是PHP解决方案。



可能有智能SQL可以选择缺少的dateTimes(差距)。



任何想法?



更新:
因为DB结构不清楚。
列是表中的一个字段,所以我们有一个表:tblValues。
有3个字段:columnName,datetime和value。
在我们的例子中,columnName将是columnA或columnB。

解决方案

这不是太糟糕。这里的关键是保留数字表。它只是一个表,一列,数字1到尽可能多的你想要的。它们非常有用。这是我最喜欢的创建方法(由Itzik Ben-Gan提供) :

 ; WITH 
Pass0 as(select 1 as C union all select 1),--2 rows
Pass1为(选择1作为C从Pass0作为A,Pass0为B), - 4行
Pass2为(选择1为C,从Pass1为A,Pass1为B), - 16行
Pass3为(从Pass2中选择1作为A,Pass2为B), - 256行
Pass4为(从C中选择1作为A,Pass3为B), - 65536行
Pass5为(选择1作为C从Pass4作为A,Pass4为B), - 4,294,967,296行
计数为(select row_number()((C))作为Number从Pass5)
选择编号INTO dbo.来自Tally的数字号码< = 1000000

现在你已经有了,这只是使用数字表来生成数据应该的位置,然后将其连接到您实际拥有的位置。这样做:

  WITH [data] AS(SELECT * FROM(VALUES 
('2013-10-05 00:10:00',0.51231),
('2013-10-05 00:20:00',0.54123),
('2013-10-05 00:30:00',0.53312 )
('2013-10-05 00:50:00',0.54123)
)AS x(d,v)),
[times] AS(
SELECT TOP 10 DATEADD(MINUTE,10 * [Number],'2013-10-05')AS [d]
FROM dba.dbo。[Numbers] AS n

SELECT * FROM [times]
LEFT JOIN [data]
ON [times] .d = [data] .d
WHERE [data] .d IS NULL


I have a column set (let's say column A and column B). Each column can have multiple points (datetime and value). Each column set have predefined timestep. For below example it's: 600sec.

columnA

dateTime - '2013-10-05 00:10:00' 
Value - 0.51231
dateTime - '2013-10-05 00:20:00' 
Value - 0.54123
dateTime - '2013-10-05 00:30:00' 
Value - 0.53312
...
dateTime - '2013-11-14 13:30:00' 
Value - 0.53312

etc.

columnB

dateTime - '2013-10-05 00:10:00' 
Value - 0.51231
dateTime - '2013-10-05 00:20:00' 
Value - 0.54123
dateTime - '2013-10-05 00:50:00' 
Value - 0.54123
...
dateTime - '2013-11-14 13:20:00' 
    Value - 0.53312
dateTime - '2013-11-14 13:30:00' 
Value - 0.53312
dateTime - '2013-11-14 13:40:00' 
Value - 0.53312

I need to find missing data points for each column and insert NULL points so all columns would have exact amount of points.

For above examples missing data points are:

columnA - 2013-11-14 13:40:00
columnB - 2013-10-05 00:30:00, 2013-10-05 00:40:00

I know I can iterate using T-SQL or PHP (client side) for each column for given column set and check each point, but it would be very slow I think. Especially PHP solution.

There is possibly smart SQL that would be able to select missing dateTimes (gaps).

Any ideas?

UPDATE: Because it's not clear about the DB structure. Column is a field within the table, so, let's have a table: tblValues. There are 3 fields: columnName, datetime and value. In our examples columnName would be the columnA or columnB.

解决方案

It's not too bad. The key here is to keep a numbers table around. It's just a table with one column and the numbers 1 through as many as you want in it. They're incredibly useful. Here's my favorite method to create one (courtesy of Itzik Ben-Gan):

;WITH
  Pass0 as (select 1 as C union all select 1), --2 rows
  Pass1 as (select 1 as C from Pass0 as A, Pass0 as B),--4 rows
  Pass2 as (select 1 as C from Pass1 as A, Pass1 as B),--16 rows
  Pass3 as (select 1 as C from Pass2 as A, Pass2 as B),--256 rows
  Pass4 as (select 1 as C from Pass3 as A, Pass3 as B),--65536 rows
  Pass5 as (select 1 as C from Pass4 as A, Pass4 as B),--4,294,967,296 rows
  Tally as (select row_number() over(order by C) as Number from Pass5)
 select Number INTO dbo.Numbers from Tally where Number <= 1000000

Now that you've got that, it's just a matter of using the numbers table to generate where the data should be and then left join it to what you actually have. Something like this:

WITH [data] AS (SELECT * FROM (VALUES 
    ('2013-10-05 00:10:00', 0.51231),
    ('2013-10-05 00:20:00', 0.54123),
    ('2013-10-05 00:30:00', 0.53312),
    ('2013-10-05 00:50:00', 0.54123)
) AS x(d, v)),
[times] AS (
    SELECT TOP 10 DATEADD(MINUTE, 10*[Number], '2013-10-05') AS [d]
    FROM dba.dbo.[Numbers] AS n
)
SELECT * FROM [times] 
LEFT JOIN [data]
    ON [times].d = [data].d
WHERE [data].d IS NULL

这篇关于如何找到数据空白并插入NULL数据点而不是间隙的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆