Plot the result of a groupby operation in pandas

Views: 145 · Published: 2018/5/30 14:15:51 · Tags: python pandas matplotlib dataframe group-by


Question

I have this sample table:

         ID        Date  Days  Volume/Day
     0  111  2016-01-01    20          50
     1  111  2016-02-01    25          40
     2  111  2016-03-01    31          35
     3  111  2016-04-01    30          30
     4  111  2016-05-01    31          25
     5  111  2016-06-01    30          20
     6  111  2016-07-01    31          20
     7  111  2016-08-01    31          15
     8  111  2016-09-01    29          15
     9  111  2016-10-01    31          10
    10  111  2016-11-01    29           5
    11  111  2016-12-01    27           0
     0  112  2016-01-01    31          55
     1  112  2016-02-01    26          45
     2  112  2016-03-01    31          40
     3  112  2016-04-01    30          35
     4  112  2016-04-01    31          30
     5  112  2016-05-01    30          25
     6  112  2016-06-01    31          25
     7  112  2016-07-01    31          20
     8  112  2016-08-01    30          20
     9  112  2016-09-01    31          15
    10  112  2016-11-01    29          10
    11  112  2016-12-01    31           0

I'm trying to make my final table look like the one below after grouping by ID and Date.

         ID        Date  CumDays  Volume/Day
     0  111  2016-01-01       20          50
     1  111  2016-02-01       45          40
     2  111  2016-03-01       76          35
     3  111  2016-04-01      106          30
     4  111  2016-05-01      137          25
     5  111  2016-06-01      167          20
     6  111  2016-07-01      198          20
     7  111  2016-08-01      229          15
     8  111  2016-09-01      258          15
     9  111  2016-10-01      289          10
    10  111  2016-11-01      318           5
    11  111  2016-12-01      345           0
     0  112  2016-01-01       31          55
     1  112  2016-02-01       57          45
     2  112  2016-03-01       88          40
     3  112  2016-04-01      118          35
     4  112  2016-05-01      149          30
     5  112  2016-06-01      179          25
     6  112  2016-07-01      210          25
     7  112  2016-08-01      241          20
     8  112  2016-09-01      271          20
     9  112  2016-10-01      302          15
    10  112  2016-11-01      331          10
    11  112  2016-12-01      362           0

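Next, I want to extract the first value of Volume/Day per ID, all the CumDays values, and all the Volume/Day values per ID and Date, so that I can use them for further computation and for plotting Volume/Day vs CumDays. For example, for ID 111 the first value of Volume/Day is 50, and for ID 112 it is 55. The CumDays values for ID 111 are 20, 45, ... and for ID 112 they are 31, 57, ... The Volume/Day values for ID 111 are 50, 40, ... and for ID 112 they are 55, 45, ...

My solution: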

    def get_time_rate(grp_df):
        t = grp_df['Days'].cumsum()
        r = grp_df['Volume/Day']
        return t, r

    vals = df.groupby(['ID', 'Date']).apply(get_time_rate)
    vals

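With this, the cumulative calculation has no effect at all: it returns the original Days values. As a result, I can't go on to extract the first value of Volume/Day, all the CumDays values, and all the Volume/Day values I need. Any advice or help on how to go about it would be appreciated. Thanks.

Solution

Get a groupby object.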

    g = df.groupby('ID')
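
A likely reason the attempt in the question returns the original Days values: grouping by both ID and Date leaves nearly every row in a group of its own, so a cumulative sum has nothing to accumulate. The sketch below illustrates this on a three-row subset of ID 111; the small frame and the names `per_id_and_date` and `per_id` are illustrative assumptions, not part of the original post.

```python
import pandas as pd

# Illustrative subset of the sample table (first three rows of ID 111).
df = pd.DataFrame({
    "ID": [111, 111, 111],
    "Date": ["2016-01-01", "2016-02-01", "2016-03-01"],
    "Days": [20, 25, 31],
})

# Each (ID, Date) pair is unique here, so every group holds a single row
# and the "cumulative" sum is just the original column.
per_id_and_date = df.groupby(["ID", "Date"])["Days"].cumsum()

# Grouping by ID alone lets the sum accumulate across dates.
per_id = df.groupby("ID")["Days"].cumsum()

print(per_id_and_date.tolist())  # [20, 25, 31]
print(per_id.tolist())           # [20, 45, 76]
```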

Compute columns with transform:

    df['CumDays'] = g.Days.transform('cumsum')
    df['First Volume/Day'] = g['Volume/Day'].transform('first')
    df

         ID        Date  Days  Volume/Day  CumDays  First Volume/Day
     0  111  2016-01-01    20          50       20                50
     1  111  2016-02-01    25          40       45                50
     2  111  2016-03-01    31          35       76                50
     3  111  2016-04-01    30          30      106                50
     4  111  2016-05-01    31          25      137                50
     5  111  2016-06-01    30          20      167                50
     6  111  2016-07-01    31          20      198                50
     7  111  2016-08-01    31          15      229                50
     8  111  2016-09-01    29          15      258                50
     9  111  2016-10-01    31          10      289                50
    10  111  2016-11-01    29           5      318                50
    11  111  2016-12-01    27           0      345                50
     0  112  2016-01-01    31          55       31                55
     1  112  2016-01-02    26          45       57                55
     2  112  2016-01-03    31          40       88                55
     3  112  2016-01-04    30          35      118                55
     4  112  2016-01-05    31          30      149                55
     5  112  2016-01-06    30          25      179                55
     6  112  2016-01-07    31          25      210                55
     7  112  2016-01-08    31          20      241                55
     8  112  2016-01-09    30          20      271                55
     9  112  2016-01-10    31          15      302                55
    10  112  2016-01-11    29          10      331                55
    11  112  2016-01-12    31           0      362                55

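If you want grouped plots, you can iterate over the groups after grouping by ID. To plot, call plot on each group with the same axes, using CumDays as x and Volume/Day as y.

    import matplotlib.pyplot as plt

    # The original snippet iterated over `df2`, which is not defined in this post;
    # `df` (the frame with the CumDays column computed above) is assumed here.
    fig, ax = plt.subplots(figsize=(8, 6))
    for i, g in df.groupby('ID'):
        # Draw one line per ID on the shared axes, labelled with the ID.
        g.plot(x='CumDays', y='Volume/Day', ax=ax, label=str(i))

    plt.show()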
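
For reference, the whole flow can be sketched end to end as a standalone script. This is a minimal sketch under stated assumptions: the six-row frame is a subset of the sample table, the `Agg` backend is chosen only so the script runs without a display, and the axis labels are additions not present in the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative subset: the first three rows of each ID from the sample table.
df = pd.DataFrame({
    "ID": [111, 111, 111, 112, 112, 112],
    "Days": [20, 25, 31, 31, 26, 31],
    "Volume/Day": [50, 40, 35, 55, 45, 40],
})

# Running total of Days within each ID.
df["CumDays"] = df.groupby("ID")["Days"].cumsum()

# One line per ID, all drawn on the same axes.
fig, ax = plt.subplots(figsize=(8, 6))
for key, grp in df.groupby("ID"):
    grp.plot(x="CumDays", y="Volume/Day", ax=ax, label=str(key))

ax.set_xlabel("CumDays")
ax.set_ylabel("Volume/Day")
# Call plt.show() in an interactive session, or fig.savefig(...) to write a file.
```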

