Keep original data points when padding a signal with pandas

Question

Consider the following test data set:

from datetime import datetime
import pandas

testdf = pandas.DataFrame({'t': [datetime(2015, 1, 1, 10,  0),
                                 datetime(2015, 1, 1, 11, 32),
                                 datetime(2015, 1, 1, 12,  0)],
                           'val': [1, 2, 3]})

I would like to interpolate this data set using simple padding, such that I have a data point at least every 30 mins, while keeping the original data points.

A suitable result would look like this:

't'                'val'
2015-01-01 10:00   1
2015-01-01 10:30   1
2015-01-01 11:00   1
2015-01-01 11:30   1
2015-01-01 11:32   2
2015-01-01 12:00   3

Which would be a good way of achieving this result, preferably using standard pandas methods?

I am aware of the DataFrame.resample method, but

a) I can't seem to find the right values of the how parameter to achieve the desired simple padding, and

b) I can't find a way to keep the original data points in the result.

Problem b) could of course be circumvented by manually adding the original data points to the resampled DataFrame, although I wouldn't call this a particularly neat solution.

Recommended answer

Generate a regular 30-minute index that covers the time range and create a DataFrame of NaN values on it. Then combine it with the original data using the combine_first method and forward-fill the NaN values:

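import numpy

# Regular 30-minute grid covering the range of the original data
idx = pandas.date_range(datetime(2015, 1, 1, 10, 0), datetime(2015, 1, 1, 12, 0), freq='30min')
df = pandas.DataFrame(numpy.nan, index=idx, columns=['val'])

# Original values win on the union of both indexes; .ffill() is equivalent to
# fillna(method='ffill'), which recent pandas versions deprecate
testdf = testdf.set_index('t')
result = testdf.combine_first(df).ffill()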

The documentation of the combine_first method reads:

Combine two DataFrame objects and default to non-null values in frame calling the method. Result index columns will be the union of the respective indexes and columns
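
To make the quoted behaviour concrete, here is a minimal sketch on two toy frames. The frames `left` and `right` and their values are made up for illustration and are not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frames, chosen only to illustrate combine_first.
left = pd.DataFrame({"val": [1.0, np.nan]}, index=["a", "b"])
right = pd.DataFrame({"val": [10.0, 20.0, 30.0]}, index=["a", "b", "c"])

combined = left.combine_first(right)

# "a": the calling frame has a non-null value, so it is kept.
# "b": the calling frame has NaN, so the value comes from `right`.
# "c": present only in `right`, so it is added via the index union.
print(combined)
```

In the answer above, the NaN-only frame plays the role of `right`: it contributes nothing but its timestamps, which is why the original data points survive unchanged.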

The ffill option of the fillna method (also available directly as DataFrame.ffill) does the following:

ffill: propagate last valid observation forward to next valid
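
As a possible alternative, the same padding can be sketched with Index.union and DataFrame.reindex(method='ffill'), both standard pandas calls. This variant is not part of the original answer; the names `grid`, `full_index` and `padded` are chosen here for illustration, and the sketch assumes the timestamps are already sorted, which reindex with a fill method requires.

```python
from datetime import datetime

import pandas as pd

# Test data from the question, indexed by timestamp.
testdf = pd.DataFrame({"t": [datetime(2015, 1, 1, 10, 0),
                             datetime(2015, 1, 1, 11, 32),
                             datetime(2015, 1, 1, 12, 0)],
                       "val": [1, 2, 3]}).set_index("t")

# Regular 30-minute grid spanning the original data.
grid = pd.date_range(testdf.index.min(), testdf.index.max(), freq="30min")

# Union of the grid and the original timestamps (sorted, no duplicates).
full_index = testdf.index.union(grid)

# Forward-fill while reindexing: every new timestamp takes the last known value,
# and the original timestamps keep their own values.
padded = testdf.reindex(full_index, method="ffill")
print(padded)
```

Because no NaN values are ever introduced here, the val column is not converted to float, whereas the combine_first approach passes through a float column.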
