Data manipulation startdate enddate python pandas
Problem description
I have a promotion description dataset with information about the various promotions being run and their start and end dates:
promo         item  start_date  end_date
Buy1-get 1    A     2015-01-08  2015-01-12
Buy1-get 1    A     2015-02-16  2015-02-20
Buy1-40% off  B     2016-05-08  2016-05-09
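For anyone who wants to reproduce the examples, the table above could be built as a DataFrame roughly as follows. This is a minimal sketch: it assumes the dates arrive as strings and converts them to datetimes, which the question does not state explicitly.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "promo": ["Buy1-get 1", "Buy1-get 1", "Buy1-40% off"],
    "item": ["A", "A", "B"],
    "start_date": ["2015-01-08", "2015-02-16", "2016-05-08"],
    "end_date": ["2015-01-12", "2015-02-20", "2016-05-09"],
})

# Assumption: the dates are stored as strings, so convert them to datetime64.
for col in ["start_date", "end_date"]:
    df[col] = pd.to_datetime(df[col])

print(df.dtypes)
```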
Now I want to reorganise my data for subsequent analysis so that there is a single date variable, with one row per day carrying the promo information for that day.
date item Promo
2015-01-08 A Buy1-get 1
2015-01-09 A Buy1-get 1
2015-01-10 A ......
2015-01-11 ....
2015-01-12
2015-02-16 A Buy1-get 1
2015-02-17 A Buy1-get 1
2015-02-18 .... .......
2015-02-19 .....
..........
2016-05-09 B Buy1-40% off
Any help would be much appreciated.
Recommended answer
You can use concat on all the Series created by date_range with itertuples, and then join the columns promo and item:
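import pandas as pd

# For each row, build a Series indexed by every day of the promotion,
# holding the original row label so item and promo can be joined back later.
df1 = pd.concat([pd.Series(r.Index, pd.date_range(r.start_date, r.end_date))
                 for r in df.itertuples()]).reset_index()
df1.columns = ['date', 'idx']
df1 = df1.set_index('idx')
df1 = df1.join(df[['item', 'promo']]).reset_index(drop=True)
print(df1)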
date item promo
0 2015-01-08 A Buy1-get 1
1 2015-01-09 A Buy1-get 1
2 2015-01-10 A Buy1-get 1
3 2015-01-11 A Buy1-get 1
4 2015-01-12 A Buy1-get 1
5 2015-02-16 A Buy1-get 1
6 2015-02-17 A Buy1-get 1
7 2015-02-18 A Buy1-get 1
8 2015-02-19 A Buy1-get 1
9 2015-02-20 A Buy1-get 1
10 2016-05-08 B Buy1-40% off
11 2016-05-09 B Buy1-40% off
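The concat approach above can also be wrapped in a small function and run end to end on the sample data. This is only a sketch: the name expand_promos is a hypothetical helper chosen for illustration, and the DataFrame is rebuilt from the table in the question.

```python
import pandas as pd


def expand_promos(df):
    """Return one row per day of each promotion (hypothetical helper name)."""
    # One Series per input row: the index holds each day of the promotion,
    # the value is the label of the original row.
    parts = [pd.Series(r.Index, index=pd.date_range(r.start_date, r.end_date))
             for r in df.itertuples()]
    out = pd.concat(parts).reset_index()
    out.columns = ["date", "idx"]
    # Look up item and promo through the original row label.
    out = out.set_index("idx").join(df[["item", "promo"]])
    return out.reset_index(drop=True)


# Sample data copied from the question.
df = pd.DataFrame({
    "promo": ["Buy1-get 1", "Buy1-get 1", "Buy1-40% off"],
    "item": ["A", "A", "B"],
    "start_date": ["2015-01-08", "2015-02-16", "2016-05-08"],
    "end_date": ["2015-01-12", "2015-02-20", "2016-05-09"],
})

result = expand_promos(df)
print(result)
```

date_range accepts date strings as well as datetimes, so this version should work whether or not the date columns have been converted beforehand.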
Another solution with melt and groupby with resample:
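df1 = df.reset_index().rename(columns={'index': 'idx'})
# Stack start_date and end_date into a single 'date' column
# (the dates must be datetimes for resample to work).
df1 = pd.melt(df1, id_vars='idx', value_vars=['start_date', 'end_date'],
              value_name='date').set_index('date')
# Within each original row, resample daily to fill in the days between start and end.
# errors='ignore': depending on the pandas version, 'idx' may not be present as a column here.
df1 = (df1.groupby('idx')
          .resample('D')
          .ffill()
          .reset_index(level=1)
          .drop(['idx', 'variable'], axis=1, errors='ignore'))
df1 = df1.join(df[['item', 'promo']]).reset_index(drop=True)
print(df1)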
date item promo
0 2015-01-08 A Buy1-get 1
1 2015-01-09 A Buy1-get 1
2 2015-01-10 A Buy1-get 1
3 2015-01-11 A Buy1-get 1
4 2015-01-12 A Buy1-get 1
5 2015-02-16 A Buy1-get 1
6 2015-02-17 A Buy1-get 1
7 2015-02-18 A Buy1-get 1
8 2015-02-19 A Buy1-get 1
9 2015-02-20 A Buy1-get 1
10 2016-05-08 B Buy1-40% off
11 2016-05-09 B Buy1-40% off
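To see why the melt step makes the resample possible, it may help to look at the intermediate frame on its own. The sketch below rebuilds the sample data from the question (assuming the dates are parsed as datetimes) and prints the melted result: each original row becomes two rows, one for its start date and one for its end date, sharing the same idx. Resampling daily within each idx group then fills in the days in between.

```python
import pandas as pd

# Sample data copied from the question, with dates parsed as datetimes.
df = pd.DataFrame({
    "promo": ["Buy1-get 1", "Buy1-get 1", "Buy1-40% off"],
    "item": ["A", "A", "B"],
    "start_date": pd.to_datetime(["2015-01-08", "2015-02-16", "2016-05-08"]),
    "end_date": pd.to_datetime(["2015-01-12", "2015-02-20", "2016-05-09"]),
})

# Keep the original row label as 'idx', then stack the two date columns.
melted = pd.melt(df.reset_index().rename(columns={"index": "idx"}),
                 id_vars="idx",
                 value_vars=["start_date", "end_date"],
                 value_name="date")

# Sort only for display, so each promotion's start and end sit next to each other.
print(melted.sort_values(["idx", "date"]))
```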