Keep original data points when padding a signal with pandas
Question
Consider the following test data set:
from datetime import datetime
import pandas

testdf = pandas.DataFrame({'t': [datetime(2015, 1, 1, 10, 0),
                                 datetime(2015, 1, 1, 11, 32),
                                 datetime(2015, 1, 1, 12, 0)],
                           'val': [1, 2, 3]})
I would like to interpolate this data set using simple padding, such that I have a data point at least every 30 minutes, while keeping the original data points.
A suitable result would look like this:
't' 'val'
2015-01-01 10:00 1
2015-01-01 10:30 1
2015-01-01 11:00 1
2015-01-01 11:30 1
2015-01-01 11:32 2
2015-01-01 12:00 3
What would be a good way of achieving this result, preferably using standard pandas methods?
I am aware of the DataFrame.resample method, but
a) I can't seem to find the right value of the how parameter to achieve the desired simple padding, and
b) I can't find a way to keep the original data points in the result.
Problem b) could of course be circumvented by manually adding the original data points to the resampled DataFrame, although I wouldn't call this a particularly neat solution.
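For illustration, the manual workaround mentioned above might look roughly like the sketch below. It assumes that DataFrame.asfreq with method='ffill' is an acceptable stand-in for the resampling step; the names indexed, regular and manual are only illustrative.

```python
from datetime import datetime

import pandas as pd

testdf = pd.DataFrame({'t': [datetime(2015, 1, 1, 10, 0),
                             datetime(2015, 1, 1, 11, 32),
                             datetime(2015, 1, 1, 12, 0)],
                       'val': [1, 2, 3]})
indexed = testdf.set_index('t')

# Regular 30-minute grid from the first to the last timestamp; each grid
# point takes the value of the last observation at or before it.
regular = indexed.asfreq('30min', method='ffill')

# Manually add the original points back, drop the timestamps that now
# appear twice (10:00 and 12:00), and restore chronological order.
manual = pd.concat([regular, indexed])
manual = manual[~manual.index.duplicated(keep='first')].sort_index()
print(manual)
```

This works, but the concatenate, de-duplicate and sort steps are exactly the bookkeeping that makes the approach feel less than neat.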
Recommended answer
Generate an index with the missing timestamps and create a DataFrame with NaN values on that index. Then combine it with the original data using the combine_first method and fill in the NaN values:
import numpy

idx = pandas.date_range(datetime(2015, 1, 1, 10, 0), datetime(2015, 1, 1, 12, 0), freq='30min')
df = pandas.DataFrame(numpy.nan, index=idx, columns=['val'])
testdf = testdf.set_index('t')
# .ffill() does the same as fillna(method='ffill'), which newer pandas versions deprecate
testdf.combine_first(df).ffill()
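The same idea can also be wrapped in a small reusable function. The following is only a sketch: pad_keep_original is a hypothetical helper name, not a pandas function, and it swaps combine_first for a reindex on the union of the original index and the regular grid, which should give the same rows here.

```python
from datetime import datetime

import pandas as pd


def pad_keep_original(df, time_col, freq):
    """Forward-fill onto a regular grid while keeping the original rows.

    Hypothetical helper for illustration; not part of pandas.
    """
    indexed = df.set_index(time_col).sort_index()
    # Regular grid spanning the first to the last original timestamp.
    grid = pd.date_range(indexed.index.min(), indexed.index.max(), freq=freq)
    # Union of the original timestamps and the grid, then pad forward.
    full_index = indexed.index.union(grid)
    return indexed.reindex(full_index).ffill()


testdf = pd.DataFrame({'t': [datetime(2015, 1, 1, 10, 0),
                             datetime(2015, 1, 1, 11, 32),
                             datetime(2015, 1, 1, 12, 0)],
                       'val': [1, 2, 3]})
padded = pad_keep_original(testdf, 't', '30min')
print(padded)
```

For the test data this gives six rows (10:00, 10:30, 11:00, 11:30, 11:32, 12:00) with values 1, 1, 1, 1, 2, 3. Note that val becomes a float column, because NaN is introduced during reindexing before the fill.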
The documentation of the combine_first method reads:
Combine two DataFrame objects and default to non-null values in frame calling the method. Result index columns will be the union of the respective indexes and columns
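A tiny example may make that description more concrete. The frames a and b below are made up for illustration: values from the calling frame win wherever they are non-null, and the result covers the union of both indexes.

```python
import numpy as np
import pandas as pd

# 'a' is the calling frame; it has a gap at 'y' and no row for 'z'.
a = pd.DataFrame({'val': [1.0, np.nan]}, index=['x', 'y'])
b = pd.DataFrame({'val': [10.0, 20.0, 30.0]}, index=['x', 'y', 'z'])

combined = a.combine_first(b)
print(combined)
```

Here 'x' keeps 1.0 from a, 'y' is taken from b because a holds NaN there, and 'z' exists only in b. In the answer above, the NaN-only grid frame plays the role of b, so it contributes nothing but its extra timestamps.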
The ffill option of the fillna method does the following (source):
ffill: propagate last valid observation forward to next valid
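As a quick illustration of forward filling, here is a minimal sketch on a made-up Series, using the ffill method directly:

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1.0, np.nan, np.nan, 2.0, np.nan])

# Each NaN takes the last valid value before it; a leading NaN has
# nothing to copy from and therefore stays NaN.
filled = s.ffill()
print(filled.tolist())
```

This is why the regular grid has to start at or after the first original timestamp: grid points earlier than the first observation would remain NaN.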