Pandas resample by groups with duplicate datetimes
Question
There are lots of similar questions on here, but I couldn't find any that actually had observations with the same datetime. A minimal non-working example would be:
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Date": np.tile(["2016-01", "2016-03"], 2),  # the same two months, repeated for each group
     "Group": [1, 1, 2, 2],
     "Obs": [1, 2, 5, 6]})
Now I'd like to linearly interpolate the value for February 2016 by group, so the required output is:
Date Group Obs
2016-01 1 1
2016-02 1 1.5
2016-03 1 2
2016-01 2 5
2016-02 2 5.5
2016-03 2 6
My understanding is that resample should be able to do this (in my actual application I'm trying to move from quarterly to monthly, so I have observations in Jan and Apr), but that requires some sort of time index, which I can't create because there are duplicates in the Date column.
I'm assuming some sort of groupby magic could help, but I can't figure it out!
Recommended answer
Edit: replaced resample with reindex for a 2x speed improvement.
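df = df.set_index('Date')
index = ['2016-01', '2016-02', '2016-03']  # the months every group should end up with
# reindex adds the missing month as a NaN row; interpolate() then fills it linearly
df.groupby('Group').apply(lambda df1: df1.reindex(index).interpolate())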
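To get back to the flat Date / Group / Obs layout shown in the question, one possible sketch is to select only the Obs column before applying and then reset the index. The names months, fill_months and result are illustrative and not part of the original answer, and selecting Obs first is my own variation (it keeps the Group column from being interpolated along with the observations).

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2016-01", "2016-03"] * 2,
    "Group": [1, 1, 2, 2],
    "Obs": [1, 2, 5, 6],
})

# Hypothetical helper names; the month list must cover every period wanted per group.
months = ["2016-01", "2016-02", "2016-03"]

def fill_months(obs):
    # obs is one group's Obs values indexed by Date.
    # reindex inserts NaN for missing months, interpolate fills them linearly.
    return obs.reindex(months).interpolate()

result = (
    df.set_index("Date")
      .groupby("Group")["Obs"]
      .apply(fill_months)   # Series indexed by (Group, Date)
      .reset_index()        # back to ordinary columns
)
print(result)
```

Because the Date values are only unique within a group, the duplicates never collide: each group is reindexed on its own.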
Using groupby is easy once you understand that it just passes one dataframe (here df1) per value in the grouping column to the function given to apply.
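A small sketch of that idea, assuming the example data from the question: iterating over the groupby object yields the same per-group dataframes that apply would pass to the lambda. The dictionary name pieces is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2016-01", "2016-03"] * 2,
    "Group": [1, 1, 2, 2],
    "Obs": [1, 2, 5, 6],
}).set_index("Date")

pieces = {}
for key, df1 in df.groupby("Group"):
    # key is the group's value, df1 holds only that group's rows
    pieces[key] = df1
    print(key, df1["Obs"].tolist())
# prints:
# 1 [1, 2]
# 2 [5, 6]
```

Within each df1 the Date index has no duplicates, which is why reindexing works per group even though it would fail on the whole frame.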