Pandas unable to filter rows by quarter in specific year

Problem description

I have a dataset like the one below:

   Store        Date  Weekly_Sales
0      1  2010-05-02    1643690.90
1      1  2010-12-02    1641957.44
2      1  2010-02-19    1611968.17
3      1  2010-02-26    1409727.59
4      1  2010-05-03    1554806.68

There are 100 stores in all. I want to filter the data for the year 2012 by quarter.

# Keep only the 2012 rows from the dataset
import pandas as pd

df['Date'] = pd.to_datetime(df['Date'])
ds_2012 = df[df['Date'].dt.year == 2012]

# Sum weekly sales per store and calendar quarter
ds_2012 = ds_2012.sort_values(['Date'], ascending=True)
quarterly_sales = ds_2012.groupby(['Store', pd.Grouper(key='Date', freq='Q')])['Weekly_Sales'].sum()
quarterly_sales.head(20)

Output received:

Store     Date      
1      2012-03-31    18951097.69
       2012-06-30    21036965.58
       2012-09-30    18633209.98
       2012-12-31     9580784.77

The sums for Q2 (2012-06-30) and Q3 (2012-09-30) are both incorrect when checked against the same data filtered in Excel. I am new to pandas.

Recommended answer

You can group by store and resample the DataFrame quarterly:

import pandas as pd
# Sample data: two stores, twelve month-end dates each
dates = pd.date_range(start='2020-01-01', periods=12, freq='M')
df = pd.concat([pd.DataFrame({'Store': [i]*12, 'Date': dates, 'Sales': list(range(12))}) for i in [1, 2]])
# Sum sales per store and calendar quarter
df.groupby('Store').resample('Q', on='Date').sum().drop('Store', axis=1)

                  Sales
Store Date             
1     2020-03-31      3
      2020-06-30     12
      2020-09-30     21
      2020-12-31     30
2     2020-03-31      3
      2020-06-30     12
      2020-09-30     21
      2020-12-31     30
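
The same idea can be sketched with the question's column names (Store, Date, Weekly_Sales). This is an illustration under assumptions, not the answerer's code: the sample values are made up (one row per Friday of 2012 for two stores), and it labels quarters with `dt.to_period('Q')` instead of `resample`. Counting rows per quarter next to the sum gives a quick figure to compare against Excel.

```python
import pandas as pd

# Made-up sample data: two stores, one weekly row per Friday of 2012.
dates = pd.date_range(start='2012-01-06', end='2012-12-28', freq='W-FRI')
sample = pd.concat(
    [pd.DataFrame({'Store': store, 'Date': dates, 'Weekly_Sales': 100.0 * store})
     for store in [1, 2]],
    ignore_index=True,
)

# Keep only the 2012 rows, then label each row with its calendar quarter.
in_2012 = sample[sample['Date'].dt.year == 2012]
quarter = in_2012['Date'].dt.to_period('Q').rename('Quarter')

# Total sales and number of weekly rows per store and quarter.
quarterly_sales = in_2012.groupby(['Store', quarter])['Weekly_Sales'].sum()
weeks_per_quarter = in_2012.groupby(['Store', quarter]).size()

print(quarterly_sales)
print(weeks_per_quarter)
```

If a quarter shows an unexpected number of weekly rows, the problem is more likely in the dates themselves than in the grouping step.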

It may also help to check the groupby and resample documentation.
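
One possible cause of the mismatch with Excel, offered as an assumption that the question does not confirm: the sample dates in the question (2010-05-02, 2010-12-02, 2010-02-19, 2010-02-26, 2010-05-03) would form a regular weekly series (5, 12, 19, 26 February, 5 March) if the day and month of the ambiguous values had been swapped during parsing. If the raw file stores dates as day-month-year, passing an explicit `format` to `pd.to_datetime` avoids that ambiguity. The raw strings below are hypothetical reconstructions of what such a file might contain.

```python
import pandas as pd

# Hypothetical raw values, ASSUMING the source file stores dates as DD-MM-YYYY.
raw = pd.DataFrame({
    'Store': [1, 1, 1, 1, 1],
    'Date': ['05-02-2010', '12-02-2010', '19-02-2010', '26-02-2010', '05-03-2010'],
    'Weekly_Sales': [1643690.90, 1641957.44, 1611968.17, 1409727.59, 1554806.68],
})

# An explicit format removes the day/month ambiguity.
raw['Date'] = pd.to_datetime(raw['Date'], format='%d-%m-%Y')

# Sanity check: weekly data should be seven days apart once parsed correctly.
gaps_in_days = raw['Date'].diff().dropna().dt.days.tolist()

# Quarterly totals per store, with the quarter taken from the parsed date.
quarter = raw['Date'].dt.to_period('Q').rename('Quarter')
quarterly_sales = raw.groupby(['Store', quarter])['Weekly_Sales'].sum()

print(gaps_in_days)
print(quarterly_sales)
```

With the dates read this way, all five sample rows fall in the first quarter of 2010, whereas the month-first reading places some of them in May and December. A shift of that kind would move sales between quarters and could explain totals that disagree with Excel.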
