Merge series/dataframe with different time frequencies in Python
Question
I am trying to merge two series, one at hourly frequency and the other at minute frequency, keeping the index at minute level:
Series A:
time
2017-09-01 01:00:00 0.5
2017-09-01 02:00:00 0.4
Freq: H, Name: A, dtype: float64
Series B (minute level):
time
2017-09-01 00:00:00 NaN
2017-09-01 00:03:00 -0.000350
2017-09-01 00:06:00 0.000401
Name: B, dtype: float64
I want to merge the two series by hour. The desired merged dataframe would be:
time
2017-09-01 00:00:00 NaN 0.5
2017-09-01 00:03:00 -0.000350 0.5
2017-09-01 00:06:00 0.000401 0.5
2017-09-01 01:00:00 0.002301 0.4
2017-09-01 01:03:00 0.005601 0.4
After searching on SO, one approach I found is to create an 'hour' column in series B and then join A to B using the index of A and the 'hour' column of B (reference: Merging/combining two dataframes with different frequency time series indexes in Pandas?):
B = B.to_frame()
B['hour'] = B.index.to_period('H')
merged = B.join(A, on = 'hour', how = 'left')
But I get this error:
TypeError: Argument 'values' has incorrect type (expected numpy.ndarray, got Index)
Does anyone know how to solve this? Or is there a better way of joining the two series? Thanks a lot!
Recommended answer
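Demo (shift A's index back by one hour, upsample it to 3-minute frequency with forward fill, then join it onto B):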
In [280]: A
Out[280]:
time
2017-09-01 01:00:00 0.5
2017-09-01 02:00:00 0.4
Name: val, dtype: float64
In [281]: B
Out[281]:
time
2017-09-01 00:00:00 NaN
2017-09-01 00:03:00 -0.000350
2017-09-01 00:06:00 0.000401
Name: val, dtype: float64
In [282]: B.to_frame('B').join(A.to_frame('A').set_index(A.index.shift(-1, freq='H')).resample('3T').ffill())
Out[282]:
B A
time
2017-09-01 00:00:00 NaN 0.5
2017-09-01 00:03:00 -0.000350 0.5
2017-09-01 00:06:00 0.000401 0.5
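The one-liner above relies on B sitting on a regular 3-minute grid. As a rough sketch of the "hour key" idea from the question, the variant below floors each timestamp in B to the start of its hour and looks that key up in A. It assumes, as the desired output suggests, that A's values are labelled with the end of the hour they describe. The names A_start, hour_start and merged are illustrative only, and the sample data is rebuilt from the values shown in the question.

```python
import numpy as np
import pandas as pd

# Hourly series A; assumed to be labelled at the END of each hour,
# which is what the desired output in the question implies.
A = pd.Series(
    [0.5, 0.4],
    index=pd.DatetimeIndex(
        ["2017-09-01 01:00:00", "2017-09-01 02:00:00"], name="time"
    ),
    name="A",
)

# Minute-level series B, using the rows shown in the desired output.
B = pd.Series(
    [np.nan, -0.000350, 0.000401, 0.002301, 0.005601],
    index=pd.DatetimeIndex(
        [
            "2017-09-01 00:00:00",
            "2017-09-01 00:03:00",
            "2017-09-01 00:06:00",
            "2017-09-01 01:00:00",
            "2017-09-01 01:03:00",
        ],
        name="time",
    ),
    name="B",
)

# Re-key A by the START of each hour (one hour earlier than its label).
A_start = A.copy()
A_start.index = A.index - pd.Timedelta(hours=1)

# For every row of B, compute the start of the hour it falls in.
hour_start = B.index.floor("h")

# Look up A's value for each hour key and attach it as a new column.
# to_numpy() drops the hour-keyed index so the values align by position.
merged = B.to_frame()
merged["A"] = A_start.reindex(hour_start).to_numpy()

print(merged)
```

Because the key is a plain timestamp rather than a Period, this sidesteps the join between a DatetimeIndex and a PeriodIndex, and it does not require B to be evenly spaced. Hours in B with no matching entry in A would come out as NaN.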