How to get percentages for groupby size


Question

I am looking for a way to get percentages from the following groupby:

df.groupby(['state', 'approved_or_not']).size()

Output:

school_state  project_is_approved
AK            0                         55
              1                        290
AL            0                        256
              1                       1506
AR            0                        177
              1                        872
AZ            0                        347
              1                       1800

This is fine, but what I want is percentages within each state instead of counts:

school_state  project_is_approved
AK            0                        0.16
              1                        0.84
AL            0                        0.14
              1                        0.86

I tried but couldn't figure out a way. I would appreciate any help.

Recommended answer

Use SeriesGroupBy.value_counts with the parameter normalize=True:

df.groupby('state')['approved_or_not'].value_counts(normalize=True)

Sample:

import numpy as np
import pandas as pd

# Fix the seed so the random sample is reproducible
np.random.seed(2019)

L = list('ABC')
df = pd.DataFrame({'state': np.random.choice(L, size=10),
                   'approved_or_not': np.random.choice([0, 1], size=10)})
print (df)
  state  approved_or_not
0     A                0
1     C                0
2     B                1
3     A                0
4     C                1
5     C                1
6     A                0
7     B                0
8     A                0
9     C                1

a = df.groupby(['state', 'approved_or_not']).size()
print (a)
state  approved_or_not
A      0                  4
B      0                  1
       1                  1
C      0                  1
       1                  3
dtype: int64

a = df.groupby('state')['approved_or_not'].value_counts(normalize=True)
print (a)
state  approved_or_not
A      0                  1.00
B      0                  0.50
       1                  0.50
C      1                  0.75
       0                  0.25
Name: approved_or_not, dtype: float64
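
Note that value_counts orders the values within each state by frequency, which is why C lists 1 before 0 above. A minimal sketch of restoring the (state, value) order and scaling to percent follows. The data frame is hand-made to mirror the sample counts and is an assumption, not the asker's dataset:

```python
import pandas as pd

# Hand-made frame mirroring the sample counts (hypothetical data)
df = pd.DataFrame({
    'state': ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'C'],
    'approved_or_not': [0, 0, 0, 0, 0, 1, 0, 1, 1, 1],
})

# Proportions within each state; sort_index puts the rows back
# in (state, approved_or_not) order, matching what size() prints
shares = (df.groupby('state')['approved_or_not']
            .value_counts(normalize=True)
            .sort_index())

# Scale to percent for display
percent = (shares * 100).round(1)
print(percent)
```

The proportions sum to 1 within each state, which is a quick sanity check that the normalization happened per group rather than over the whole frame.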

Alternatively, divide the counts with Series.div by the sum per first level, state:

a = df.groupby(['state', 'approved_or_not']).size()

# Series.sum(level=0) was removed in pandas 2.0; group on the index level instead
a = a.div(a.groupby(level=0).sum(), level=0)
print (a)
state  approved_or_not
A      0                  1.00
B      0                  0.50
       1                  0.50
C      0                  0.25
       1                  0.75
dtype: float64
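
The division approach can also be written out step by step. The sketch below uses groupby(level=0).sum() for the per-state totals, which is one way to compute them on pandas versions where Series.sum no longer accepts a level argument, and compares the result with a transform-based variant. The data frame and variable names are illustrative assumptions:

```python
import pandas as pd

# Hand-made frame mirroring the sample counts (hypothetical data)
df = pd.DataFrame({
    'state': ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'C'],
    'approved_or_not': [0, 0, 0, 0, 0, 1, 0, 1, 1, 1],
})

# Row counts per (state, approved_or_not) pair
counts = df.groupby(['state', 'approved_or_not']).size()

# Total rows per state, indexed by the first level only
totals = counts.groupby(level=0).sum()

# level=0 aligns each count with the total of its own state
shares = counts.div(totals, level=0)

# Variant: transform broadcasts each state's total back onto
# the full MultiIndex, so a plain division is enough
shares_alt = counts / counts.groupby(level=0).transform('sum')
print(shares)
```

Unlike value_counts, both variants keep the sorted (state, value) order produced by size().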
