How to use pandas to group pivot table results by week?

Views: 123 | Published: 2018/5/30 13:48:29 | Tags: python, sql, group-by, pandas


Problem description

Below is a snippet of my pivot table output in .csv format after using pandas pivot_table function:

         Sub-Product  11/1/12  11/2/12  11/3/12  11/4/12  11/5/12  11/6/12
    GP   Acquisitions     164      168       54       72      203      167
    GP   Applications     190      207       65       91      227      200
    GPF  Acquisitions    1124     1142      992     1053     1467     1198
    GPF  Applications    1391     1430     1269     1357     1855     1510

The only thing I need to do now is to use groupby in pandas to sum up the values by week for each Sub Product before I output it to a .csv file.

Below is the output I want, but it is done in Excel. The first column might not be exactly the same but I am fine with that. The main thing I need to do is to group the days by week such that I can get sum of the data to be by week. (See how the top row has the dates grouped by every 7 days). Hoping to be able to do this using python/pandas. Is it possible?

    Row Labels        11/4/12 - 11/10/12   11/11/12 - 11/17/12
    GP
      Acquisitions                   926                   728
      Applications                  1092                   889
    GPF
      Acquisitions                  8206                  6425
      Applications                 10527                  8894

Solution
The tool you need is resample, which implicitly uses groupby over a time period/frequency and applies a function like mean or sum.
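
As a minimal sketch of the claim that resample behaves like a groupby over time bins, the snippet below compares the two on a single row of the question's data. The variable names (`daily`, `weekly_resample`, `weekly_groupby`) are assumptions made for illustration, and the snippet uses the method-call style (`.sum()`) of current pandas rather than the `how` keyword.

```python
import pandas as pd

# Hypothetical daily series holding the "GP Acquisitions" values from the question.
days = pd.date_range("2012-11-01", periods=6, freq="D")
daily = pd.Series([164, 168, 54, 72, 203, 167], index=days)

# resample puts the rows into weekly bins ("W" means weeks ending on Sunday)
# and then aggregates each bin.
weekly_resample = daily.resample("W").sum()

# The same result written as an explicit groupby on a weekly Grouper.
weekly_groupby = daily.groupby(pd.Grouper(freq="W")).sum()

print(weekly_resample)
print(weekly_groupby)
```

Both calls label each bin with the Sunday that closes it, so the two results should line up entry for entry.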

Read data.

    In [2]: df
    Out[2]:
          Sub-Product  11/1/12  11/2/12  11/3/12  11/4/12  11/5/12  11/6/12
    GP   Acquisitions      164      168       54       72      203      167
    GP   Applications      190      207       65       91      227      200
    GPF  Acquisitions     1124     1142      992     1053     1467     1198
    GPF  Applications     1391     1430     1269     1357     1855     1510

Set up a MultiIndex.

    In [4]: df = df.reset_index().set_index(['index', 'Sub-Product'])

    In [5]: df
    Out[5]:
                        11/1/12  11/2/12  11/3/12  11/4/12  11/5/12  11/6/12
    index Sub-Product
    GP    Acquisitions      164      168       54       72      203      167
          Applications      190      207       65       91      227      200
    GPF   Acquisitions     1124     1142      992     1053     1467     1198
          Applications     1391     1430     1269     1357     1855     1510

Parse the columns as proper datetimes. (They come in as strings.)

    In [6]: df.columns = pd.to_datetime(df.columns)

    In [7]: df
    Out[7]:
                        2012-11-01  2012-11-02  2012-11-03  2012-11-04  \
    index Sub-Product
    GP    Acquisitions         164         168          54          72
          Applications         190         207          65          91
    GPF   Acquisitions        1124        1142         992        1053
          Applications        1391        1430        1269        1357

                        2012-11-05  2012-11-06
    index Sub-Product
    GP    Acquisitions         203         167
          Applications         227         200
    GPF   Acquisitions        1467        1198
          Applications        1855        1510

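
The answer does not show how df was created, so the following is only a sketch of the first three steps under one assumption: that the frame is built by hand from the values in the question, with the product labels (GP, GPF) as an unnamed row index. The explicit `format="%m/%d/%y"` argument is also an addition here; the answer calls `pd.to_datetime` without it.

```python
import pandas as pd

# Rebuild the frame by hand (values copied from the question).
dates = ["11/1/12", "11/2/12", "11/3/12", "11/4/12", "11/5/12", "11/6/12"]
df = pd.DataFrame(
    [
        ["Acquisitions", 164, 168, 54, 72, 203, 167],
        ["Applications", 190, 207, 65, 91, 227, 200],
        ["Acquisitions", 1124, 1142, 992, 1053, 1467, 1198],
        ["Applications", 1391, 1430, 1269, 1357, 1855, 1510],
    ],
    index=["GP", "GP", "GPF", "GPF"],
    columns=["Sub-Product"] + dates,
)

# reset_index turns the unnamed index into a column called "index";
# set_index then makes (index, Sub-Product) a two-level row index.
df = df.reset_index().set_index(["index", "Sub-Product"])

# The remaining column labels are strings; parse them as month/day/year dates.
df.columns = pd.to_datetime(df.columns, format="%m/%d/%y")

print(df)
```

Passing the format explicitly is a design choice: it removes any ambiguity about whether a label such as 11/1/12 is read month-first or day-first.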
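Resample the columns (axis=1) weekly ('w'), summing by week. (how='sum' or how=np.sum are both valid options here.) Note that the how keyword belongs to the pandas version this answer was written for: it was deprecated in pandas 0.18 and later removed, and current versions call .sum() on the result of resample instead.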

    In [10]: df.resample('w', how='sum', axis=1)
    Out[10]:
                        2012-11-04  2012-11-11
    index Sub-Product
    GP    Acquisitions         458         370
          Applications         553         427
    GPF   Acquisitions        4311        2665
          Applications        5447        3365
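
For readers on a recent pandas release, here is a hedged sketch of the same weekly sum without the `how` keyword. It transposes the frame so the dates form the row index, which is one possible approach and not the one used in the answer. The `"W-SAT"` variant and the `to_csv` call are additions aimed at the question's Sunday-to-Saturday weeks and its .csv output; the variable names are assumptions for illustration.

```python
import pandas as pd

# Same data as in the answer, built directly with a two-level row index.
index = pd.MultiIndex.from_tuples(
    [
        ("GP", "Acquisitions"),
        ("GP", "Applications"),
        ("GPF", "Acquisitions"),
        ("GPF", "Applications"),
    ],
    names=["index", "Sub-Product"],
)
columns = pd.date_range("2012-11-01", periods=6, freq="D")
df = pd.DataFrame(
    [
        [164, 168, 54, 72, 203, 167],
        [190, 207, 65, 91, 227, 200],
        [1124, 1142, 992, 1053, 1467, 1198],
        [1391, 1430, 1269, 1357, 1855, 1510],
    ],
    index=index,
    columns=columns,
)

# Transpose so the dates are the row index, sum per week, then transpose back.
# "W" closes each weekly bin on Sunday, as in the answer's output.
weekly = df.T.resample("W").sum().T

# "W-SAT" closes each bin on Saturday, giving Sunday-to-Saturday weeks.
weekly_sun_to_sat = df.T.resample("W-SAT").sum().T

# Without a path argument, to_csv returns the CSV text instead of writing a file.
csv_text = weekly.to_csv()

print(weekly)
print(weekly_sun_to_sat)
```

Passing a file name to `to_csv` would write the weekly table to disk instead of returning it as a string.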
