Resample, aggregate, and interpolate time series trend data


Question

While analyzing energy demand and consumption data, I'm having trouble resampling and interpolating time series trend data.

Example data set:

timestamp                value kWh
------------------       ---------
12/19/2011 5:43:21 PM    79178
12/19/2011 5:58:21 PM    79179.88
12/19/2011 6:13:21 PM    79182.13
12/19/2011 6:28:21 PM    79183.88
12/19/2011 6:43:21 PM    79185.63

Based on these observations, I'd like to aggregate (roll up) the values over a period of time, with the frequency set to a unit of time.

For example, intervals on the hour, with any gaps of missing data filled in:

timestamp                value (approx)
------------------       ---------
12/19/2011 5:00:00 PM    79173
12/19/2011 6:00:00 PM    79179
12/19/2011 7:00:00 PM    79186

For a linear algorithm, it seems I would take the difference in time between two samples, express it as a fraction of the period, and scale the value by that factor.

TimeSpan ts = current - previous;

Double factor = ts.TotalMinutes / period;

The value and timestamp could then be calculated from that factor.

With so much information available, I'm unsure why it's so difficult to find the most elegant approach to this.

Perhaps first: are there open source analysis libraries that could be recommended?

Any recommendations for a programmatic approach? Ideally in C#, or possibly in SQL?

Or are there any similar questions (with answers) I could be pointed to?

Recommended answer

By using the ticks that DateTime uses internally to represent a point in time, you get the most accurate values possible. Since ticks do not restart at zero at midnight, you will not have problems at day boundaries.

// Sample times and full hour
DateTime lastSampleTimeBeforeFullHour = new DateTime(2011, 12, 19, 17, 58, 21);
DateTime firstSampleTimeAfterFullHour = new DateTime(2011, 12, 19, 18, 13, 21);
DateTime fullHour = new DateTime(2011, 12, 19, 18, 00, 00);

// Times as ticks (most accurate time unit)
long t0 = lastSampleTimeBeforeFullHour.Ticks;
long t1 = firstSampleTimeAfterFullHour.Ticks;
long tf = fullHour.Ticks;

// Energy samples
double e0 = 79179.88; // kWh before full hour
double e1 = 79182.13; // kWh after full hour
double ef; // interpolated energy at full hour

ef = e0 + (tf - t0) * (e1 - e0) / (t1 - t0); // ==> 79180.1275 kWh
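
The same formula can be applied to a whole series. The following is only a minimal sketch, not part of the original answer: the class name HourlyResampler and its methods are hypothetical, and it assumes the samples are sorted with strictly increasing timestamps. It interpolates only at full hours that lie between the first and last sample and does not extrapolate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, shown for illustration only.
public static class HourlyResampler
{
    // Linear interpolation between (t0, e0) and (t1, e1), evaluated at tf.
    public static double InterpolateAt(DateTime t0, double e0,
                                       DateTime t1, double e1, DateTime tf)
    {
        return e0 + (double)(tf.Ticks - t0.Ticks) * (e1 - e0) / (t1.Ticks - t0.Ticks);
    }

    // Returns one interpolated value per full hour between the first and
    // last sample (inclusive). Assumes samples are sorted by time and that
    // no two samples share a timestamp.
    public static List<KeyValuePair<DateTime, double>> Resample(
        IList<KeyValuePair<DateTime, double>> samples)
    {
        var result = new List<KeyValuePair<DateTime, double>>();
        if (samples.Count < 2) return result;

        // First full hour at or after the first sample.
        DateTime first = samples[0].Key;
        DateTime last = samples[samples.Count - 1].Key;
        DateTime hour = new DateTime(first.Year, first.Month, first.Day, first.Hour, 0, 0);
        if (hour < first) hour = hour.AddHours(1);

        int i = 0;
        while (hour <= last)
        {
            // Advance to the segment [samples[i], samples[i + 1]] containing 'hour'.
            while (i < samples.Count - 2 && samples[i + 1].Key < hour) i++;

            double value = InterpolateAt(samples[i].Key, samples[i].Value,
                                         samples[i + 1].Key, samples[i + 1].Value, hour);
            result.Add(new KeyValuePair<DateTime, double>(hour, value));
            hour = hour.AddHours(1);
        }
        return result;
    }
}
```

With the five samples from the question, this yields a single entry: 12/19/2011 6:00:00 PM at about 79180.1275 kWh. The 5:00 PM and 7:00 PM rows in the question's desired output lie outside the sampled range and would require extrapolation, which this sketch deliberately leaves out.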

Explanation of the formula
In geometry, similar triangles are triangles that have the same shape but different sizes. The formula above is based on the fact that the ratio of any two sides in one triangle is the same as the ratio of the corresponding sides in a similar triangle.

If you have a triangle with sides A, B, C and a similar triangle with sides a, b, c, then A : B = a : b. The equality of two ratios is called a proportion.

We can apply this proportionality rule to our problem:

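(e1 - e0) / (t1 - t0) = (ef - e0) / (tf - t0)
--- large triangle --   --- small triangle --

Solving this proportion for ef gives the formula used in the code above:

ef = e0 + (tf - t0) * (e1 - e0) / (t1 - t0)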
