Pandas Grouping - Values as Percent of Grouped Totals Based on Another Column

Views: 355 | Published: 2017/3/26 3:50:14 | Tags: python, pandas, dataframe, aggregate, aggregation


Question

This question is an extension of one I asked yesterday, but I will rephrase it.

Using a data frame and pandas, I am trying to work out the tip percentage for each category in a group-by.

So, using the tips dataset, I want to see, for each sex/smoker combination, what the tip percentage is for female smokers / all females and for female non-smokers / all females (and the same for males).

When I do this,

import pandas as pd
df=pd.read_csv("https://raw.githubusercontent.com/wesm/pydata-book/master/ch08/tips.csv", sep=',')
df.groupby(['sex', 'smoker'])[['total_bill','tip']].sum()

I get the following:

               total_bill     tip
sex    smoker
Female No          977.68  149.77
       Yes         593.27   96.74
Male   No         1919.75  302.00
       Yes        1337.07  183.07

But I am looking for something more like this

              Tip Pct
Female No     0.153189183
       Yes    0.163062349
Male   No     0.15731215
       Yes    0.136918785

Where Tip Pct = sum(tip)/sum(total_bill) for each group

What am I doing wrong and how do I fix this? Thank you!

I understand that this would give me tip as a percentage of total tips:

(df.groupby(['sex', 'smoker'])['tip'].sum().groupby(level = 0).transform(lambda x: x/x.sum()))

Is there a way to modify it to look at another column, i.e.

(df.groupby(['sex', 'smoker'])['tip'].sum().groupby(level = 0).transform(lambda x: x/x['total_bill'].sum()))

Thanks!

Solution

You can use apply with axis = 1 to loop through the rows of the grouped sums; for each row you can access tip and total_bill and divide them to get the tip percentage (expressed as a fraction):

(df.groupby(['sex', 'smoker'])[['total_bill','tip']].sum()
   .apply(lambda r: r.tip/r.total_bill, axis = 1))

#sex     smoker
#Female  No        0.153189
#        Yes       0.163062
#Male    No        0.157312
#        Yes       0.136919
#dtype: float64
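
As a possible alternative to the row-wise apply, the same ratio can be computed by dividing the two summed columns directly, which avoids a Python-level loop over rows. The sketch below is not part of the original answer. It uses a small hypothetical data frame with the same column names as tips.csv so that it runs without a download, and the names sums, tip_pct, bill_by_sex and tip_share_of_sex_bill are illustrative. The second calculation is one reading of the variant asked about in the question: each sex/smoker tip total divided by the total bill of the whole sex.

```python
import pandas as pd

# Hypothetical stand-in rows with the same columns as tips.csv
# (made-up values, not the real dataset).
df = pd.DataFrame({
    "sex": ["Female", "Female", "Female", "Male", "Male", "Male"],
    "smoker": ["No", "Yes", "No", "No", "Yes", "Yes"],
    "total_bill": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "tip": [1.0, 4.0, 5.0, 8.0, 5.0, 6.0],
})

# Sum both columns per (sex, smoker) group, as in the question.
sums = df.groupby(["sex", "smoker"])[["total_bill", "tip"]].sum()

# Vectorized ratio: sum(tip) / sum(total_bill) for each group.
tip_pct = (sums["tip"] / sums["total_bill"]).rename("tip_pct")

# Variant: each group's tip total over the total bill of its whole sex.
# transform("sum") broadcasts the per-sex total back onto every row.
bill_by_sex = sums.groupby(level="sex")["total_bill"].transform("sum")
tip_share_of_sex_bill = (sums["tip"] / bill_by_sex).rename("tip_share")

print(tip_pct)
print(tip_share_of_sex_bill)
```

With the real data, replacing the hypothetical frame with the read_csv call from the question should reproduce the figures shown in the answer for tip_pct.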
