Subset Pandas DataFrame based on annual returning period covering multiple months
Question
This question is similar to Selecting Pandas DataFrame records for many years based on month & day range, but neither the question nor the answer seems to cover my case.
import pandas as pd
import numpy as np
rng = pd.date_range('2010-1-1', periods=1000, freq='D')
df = pd.DataFrame(np.random.randn(len(rng)), index=rng, columns=['A'])
df.head()
A
2010-01-01 1.098302
2010-01-02 -1.384821
2010-01-03 -0.426329
2010-01-04 -0.587967
2010-01-05 -0.853374
Now I would like to subset my DataFrame based on a period that recurs every year. For example, a period can be defined as running from February 15th to October 3rd:
startMM, startdd = (2,15)
endMM, enddd = (10,3)
I then tried to slice my multi-year DataFrame based on this period:
subset = df[((df.index.month == startMM) & (startdd <= df.index.day)
| (df.index.month == endMM) & (df.index.day <= enddd))]
but this returns only rows from the months defined in startMM and endMM, not the whole period between the two dates. Any help is kindly appreciated.
subset.index.month.unique()
Int64Index([2, 10], dtype='int64')
Recommended answer
I would create a Series of (month, day) tuples:
month_day = pd.concat([
df.index.to_series().dt.month,
df.index.to_series().dt.day
], axis=1).apply(tuple, axis=1)
You can then compare them directly, because tuples are ordered element by element (month first, then day):
df[(month_day >= (startMM, startdd)) & (month_day <= (endMM, enddd))]
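As an alternative sketch (not part of the original answer), the same ordering can be obtained without building tuples by encoding each date as a single integer, month * 100 + day. The name month_day_key is made up for this illustration, and the setup repeats the example data from the question so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Same example data as in the question.
rng = pd.date_range('2010-1-1', periods=1000, freq='D')
df = pd.DataFrame(np.random.randn(len(rng)), index=rng, columns=['A'])

startMM, startdd = (2, 15)
endMM, enddd = (10, 3)

# Encode each date as month * 100 + day, e.g. February 15th -> 215.
# Integer order then matches calendar order within a year.
# (month_day_key is a hypothetical name used only in this sketch.)
month_day_key = df.index.month * 100 + df.index.day

start_key = startMM * 100 + startdd
end_key = endMM * 100 + enddd

# Keep rows whose (month, day) falls inside the period, for every year.
mask = (month_day_key >= start_key) & (month_day_key <= end_key)
subset = df[mask]
```

This assumes the period starts and ends within the same calendar year. For a period that wraps over New Year (for example November to February), the two conditions would need to be combined with | instead of &.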