Rearrange groups in a dataframe based on day and month from another dataframe's index
Problem description
I have 2 dataframes:
df_a
datetime var
2016-10-15 110.232790
2016-10-16 111.020661
2016-10-17 112.193496
2016-10-18 113.638143
2016-10-19 115.241448
2017-01-01 113.638143
2017-01-02 115.241448
and df_b
datetime var
2000-01-01 165.792185
2000-01-02 166.066959
2000-01-03 166.411669
2000-01-04 167.816046
2000-01-05 169.777814
2000-10-15 114.232790
2000-10-16 113.020661
2001-01-01 164.792185
2001-01-02 161.066959
2001-01-03 156.411669
2002-01-04 167.816046
2002-01-05 169.777814
2002-10-15 174.232790
2003-10-16 114.020661
df_a has information for the years 2016 and 2017, and df_b has information for the years 2000 to 2015 (the years do not overlap).
Can I arrange each group in df_b so that its rows follow the same day-of-year order as df_a? A group is defined as the rows that share a year, e.g. 2000.
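To reproduce the example, the two frames can be rebuilt as below. This is only a sketch: it assumes `datetime` is a `DatetimeIndex` (the answer's use of `df_a.index.month` suggests so) and that `var` is a float column.

```python
import pandas as pd

# df_a: the frame whose month-day order should be imitated
df_a = pd.DataFrame(
    {"var": [110.232790, 111.020661, 112.193496, 113.638143,
             115.241448, 113.638143, 115.241448]},
    index=pd.DatetimeIndex(
        ["2016-10-15", "2016-10-16", "2016-10-17", "2016-10-18",
         "2016-10-19", "2017-01-01", "2017-01-02"],
        name="datetime",
    ),
)

# df_b: the frame to be filtered and reordered, one group per year
df_b = pd.DataFrame(
    {"var": [165.792185, 166.066959, 166.411669, 167.816046,
             169.777814, 114.232790, 113.020661, 164.792185,
             161.066959, 156.411669, 167.816046, 169.777814,
             174.232790, 114.020661]},
    index=pd.DatetimeIndex(
        ["2000-01-01", "2000-01-02", "2000-01-03", "2000-01-04",
         "2000-01-05", "2000-10-15", "2000-10-16", "2001-01-01",
         "2001-01-02", "2001-01-03", "2002-01-04", "2002-01-05",
         "2002-10-15", "2003-10-16"],
        name="datetime",
    ),
)
```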
Recommended answer
You can chain an additional condition that checks the year:
df = df_b[df_b.index.month.isin(df_a.index.month) &
          df_b.index.day.isin(df_a.index.day) &
          (df_b.index.year == 2000)]
print(df)
var
datetime
2000-01-01 165.792185
2000-01-02 166.066959
2000-10-15 114.232790
2000-10-16 113.020661
# without the year condition, matching rows from every year are kept
df = df_b[df_b.index.month.isin(df_a.index.month) &
          df_b.index.day.isin(df_a.index.day)]
print(df)
var
datetime
2000-01-01 165.792185
2000-01-02 166.066959
2000-10-15 114.232790
2000-10-16 113.020661
2001-01-01 164.792185
2001-01-02 161.066959
2002-10-15 174.232790
2003-10-16 114.020661
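One caveat: the filter tests month and day separately, so a date can pass when its month and its day each occur in df_a but never together. This does not happen with the sample data. The sketch below uses small made-up frames (not the question's data) to show the difference and a stricter variant that compares `'%m-%d'` strings.

```python
import pandas as pd

# made-up frames, chosen so that month 1 and day 15 both occur in df_a
# but the date 01-15 does not
df_a = pd.DataFrame(
    {"var": [1.0, 2.0]},
    index=pd.DatetimeIndex(["2016-10-15", "2017-01-02"], name="datetime"),
)
df_b = pd.DataFrame(
    {"var": [10.0, 20.0, 30.0]},
    index=pd.DatetimeIndex(["2000-01-02", "2000-01-15", "2000-10-15"],
                           name="datetime"),
)

# separate month and day checks: 2000-01-15 is kept although df_a has no 01-15
loose = df_b[df_b.index.month.isin(df_a.index.month) &
             df_b.index.day.isin(df_a.index.day)]

# paired check: keep a row only if its exact month-day occurs in df_a
keys_a = df_a.index.strftime("%m-%d")
strict = df_b[df_b.index.strftime("%m-%d").isin(keys_a)]

print(loose)
print(strict)
```

Rows that pass the loose filter without a matching month-day would also have no entry in the ordering dictionary built next, so the stricter check is the safer choice when such dates can occur.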
import pandas as pd

# map each month-day of df_a to its position in df_a, using factorize
a = pd.factorize(df_a.index.strftime('%m-%d'))
d = dict(zip(a[1], a[0]))
print(d)
{'01-02': 6, '10-19': 4, '10-18': 3, '10-15': 0, '01-01': 5, '10-16': 1, '10-17': 2}
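`pd.factorize` returns a pair `(codes, uniques)`, and `dict(zip(a[1], a[0]))` pairs the i-th unique value with the i-th code. That is only a correct mapping when no month-day repeats in df_a, which holds for the sample data. The sketch below, using made-up keys, shows what goes wrong with a repeat and one possible way to build the dictionary from the positions of the unique values instead.

```python
import pandas as pd

# made-up month-day keys in which '10-15' appears twice
keys = pd.Index(["10-15", "10-16", "10-15", "01-01"])
codes, uniques = pd.factorize(keys)

# zip pairs by position, so '01-01' receives the third code (0) rather than 2
naive = dict(zip(uniques, codes))

# enumerate the unique values: each key maps to its first-appearance position
robust = {key: pos for pos, key in enumerate(uniques)}

print(naive)
print(robust)
```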
# sort key: position of the month-day in df_a plus year * 1000
# (1000 is large enough because a year has at most 366 distinct month-day values)
order = pd.Series(df.index.strftime('%m-%d'), index=df.index).map(d) + df.index.year * 1000
print (order)
datetime
2000-01-01 2000005
2000-01-02 2000006
2000-10-15 2000000
2000-10-16 2000001
2001-01-01 2001005
2001-01-02 2001006
2002-10-15 2002000
2003-10-16 2003001
Name: datetime, dtype: int64
Finally, reindex by the index of the sorted order:
df = df.reindex(order.sort_values().index)
print (df)
var
datetime
2000-10-15 114.232790
2000-10-16 113.020661
2000-01-01 165.792185
2000-01-02 166.066959
2001-01-01 164.792185
2001-01-02 161.066959
2002-10-15 174.232790
2003-10-16 114.020661
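The steps can also be collected into one function. The sketch below is not the answer's code: `reorder_like` is a hypothetical helper name, it filters on exact month-day pairs, and it sorts on separate year and rank columns instead of the `year * 1000` key. It selects rows by position, so it does not rely on the index labels being unique. The frames in the usage part are made up.

```python
import pandas as pd


def reorder_like(df_b, df_a):
    """Hypothetical helper: keep rows of df_b whose month-day occurs in df_a,
    then order each year's rows by first appearance of that month-day in df_a."""
    keys_a = df_a.index.strftime("%m-%d").unique()
    rank = {key: pos for pos, key in enumerate(keys_a)}

    kept = df_b[df_b.index.strftime("%m-%d").isin(keys_a)]

    # helper frame with a default RangeIndex, so its index holds row positions
    sort_frame = pd.DataFrame({
        "year": kept.index.year.to_numpy(),
        "rank": kept.index.strftime("%m-%d").map(rank).to_numpy(),
    })
    positions = sort_frame.sort_values(["year", "rank"]).index.to_numpy()
    return kept.iloc[positions]


# usage with small made-up frames
df_a = pd.DataFrame(
    {"var": [0.1, 0.2, 0.3]},
    index=pd.DatetimeIndex(["2016-10-15", "2016-10-16", "2017-01-01"],
                           name="datetime"),
)
df_b = pd.DataFrame(
    {"var": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
    index=pd.DatetimeIndex(["2000-01-01", "2000-01-03", "2000-10-15",
                            "2000-10-16", "2001-01-01", "2001-10-16"],
                           name="datetime"),
)
result = reorder_like(df_b, df_a)
print(result)
```

Within each year the October rows come before the January rows, following df_a, and 2000-01-03 is dropped because df_a has no 01-03.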