Pandas unable to filter rows by quarter in specific year
Question
I have a dataset like the one below:
   Store        Date  Weekly_Sales
0      1  2010-05-02    1643690.90
1      1  2010-12-02    1641957.44
2      1  2010-02-19    1611968.17
3      1  2010-02-26    1409727.59
4      1  2010-05-03    1554806.68
It has 100 stores in all. I want to filter the data for the year 2012 and total it by quarter.
# Filter out only the data in 2012 from the dataset
import pandas as pd
df['Date'] = pd.to_datetime(df['Date'])
ds_2012 = df[df['Date'].dt.year == 2012]
# Calculate Q on the dataset
ds_2012 = ds_2012.sort_values(['Date'],ascending=True)
quarterly_sales = ds_2012.groupby(['Store', pd.Grouper(key='Date', freq='Q')])['Weekly_Sales'].sum()
quarterly_sales.head(20)
Output received:
Store  Date
1      2012-03-31    18951097.69
       2012-06-30    21036965.58
       2012-09-30    18633209.98
       2012-12-31     9580784.77
The sums for Q2 (2012-06-30) and Q3 (2012-09-30) are both incorrect when compared with the same filter in Excel. I am new to pandas.
Recommended answer
You can group by store and resample the DataFrame quarterly:
import pandas as pd
df=pd.concat([pd.DataFrame({'Store':[i]*12, 'Date':pd.date_range(start='2020-01-01', periods=12, freq='M'), 'Sales':list(range(12))}) for i in [1,2]])
df.groupby('Store').resample('Q', on='Date').sum().drop('Store', axis=1)
                  Sales
Store Date
1     2020-03-31      3
      2020-06-30     12
      2020-09-30     21
      2020-12-31     30
2     2020-03-31      3
      2020-06-30     12
      2020-09-30     21
      2020-12-31     30
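To tie this back to the question's column names, here is a minimal sketch that restricts the data to 2012 and sums `Weekly_Sales` per store and quarter. It is not the resample approach above: it groups on `Series.dt.quarter` instead, which avoids depending on a frequency alias. The sample values and the `Quarter` level name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample shaped like the question's data (values are made up).
df = pd.DataFrame({
    "Store": [1, 1, 1, 2, 2, 2],
    "Date": pd.to_datetime([
        "2011-12-30", "2012-01-06", "2012-04-06",
        "2012-01-06", "2012-04-06", "2012-04-13",
    ]),
    "Weekly_Sales": [100.0, 200.0, 300.0, 10.0, 20.0, 30.0],
})

# Keep only the rows that fall in 2012.
ds_2012 = df[df["Date"].dt.year == 2012]

# Group by store and calendar quarter (1-4), then sum the weekly sales.
quarter = ds_2012["Date"].dt.quarter.rename("Quarter")
quarterly_sales = ds_2012.groupby(["Store", quarter])["Weekly_Sales"].sum()

print(quarterly_sales)
```

The result is a Series indexed by `(Store, Quarter)`, so a single total can be read with something like `quarterly_sales.loc[(1, 2)]`.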
It may also help to check the groupby and resample documentation.
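One possible cause of the mismatch with Excel, offered as a guess rather than a diagnosis: the sample rows in the question (2010-05-02, 2010-12-02, 2010-02-19, 2010-02-26, 2010-05-03) would be evenly spaced weekly dates if the day and month were swapped on the ambiguous ones. If the raw file stores dates as day-month-year, parsing without a format can put rows in the wrong month and therefore the wrong quarter. The sketch below assumes that raw format; the strings are hypothetical.

```python
import pandas as pd

# Assumption: the raw file stores dates as DD-MM-YYYY strings (hypothetical values).
raw = pd.Series(["05-02-2010", "12-02-2010", "19-02-2010", "26-02-2010", "05-03-2010"])

# An ambiguous string is read month-first by default ...
default_parse = pd.to_datetime("05-02-2010")
# ... and day-first only when asked.
dayfirst_parse = pd.to_datetime("05-02-2010", dayfirst=True)

# An explicit format removes the ambiguity for the whole column.
parsed = pd.to_datetime(raw, format="%d-%m-%Y")

# Sanity check: weekly data should be spaced exactly seven days apart.
is_weekly = (parsed.diff().dropna() == pd.Timedelta(days=7)).all()

print(default_parse.date(), dayfirst_parse.date(), is_weekly)
```

If the spacing check passes only with the explicit format, re-running the quarterly sums on the correctly parsed `Date` column would be a reasonable next step before comparing against Excel again.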