Getting a ratio in Pandas groupby object
Problem description
I have a dataframe that looks like this:
I want to create another column called "engaged_percent" for each state, which is the number of unique engaged_count values divided by that state's user_count.
I tried the following:
import pandas as pd

def f(x):
    engaged_percent = x['engaged_count'].nunique()/x['user_count']
    return pd.Series({'engaged_percent': engaged_percent})

by = df3.groupby(['user_state']).apply(f)
by
But it gave me the following result:
What I want is something like this:
user_state    engaged_percent
---------------------------------
California    2/21 = 0.09
Florida       2/7 = 0.28
I think my approach is correct; however, I am not sure why my result looks like the one in the second picture.
Any help would be much appreciated! Thanks in advance!
Recommended answer
How about:
user_count = df3.groupby('user_state')['user_count'].mean()
# (or however you think a value for each state should be calculated)
engaged_unique = df3.groupby('user_state')['engaged_count'].nunique()
engaged_pct = engaged_unique / user_count
(you could also do this in one line in a bunch of different ways)
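For instance, one possible compact form uses named aggregation to get both per-state values in a single groupby call. This is only a sketch: the sample `df3` below is made up (the original dataframe is not shown), and the column name `engaged_unique` and the variable `summary` are introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df3.
df3 = pd.DataFrame({
    "user_state": ["California", "California", "California", "Florida", "Florida"],
    "user_count": [21, 21, 21, 7, 7],
    "engaged_count": [1, 1, 2, 3, 4],
})

# Named aggregation: compute both per-state values in one groupby call.
summary = df3.groupby("user_state").agg(
    engaged_unique=("engaged_count", "nunique"),
    user_count=("user_count", "mean"),
)

# Element-wise division of two columns that share the user_state index.
summary["engaged_percent"] = summary["engaged_unique"] / summary["user_count"]
print(summary)
```

The result is a dataframe indexed by user_state, so the ratio sits next to the two values it was computed from.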
Your original solution was almost fine, except that you were dividing a single value by the entire user_count Series, so you got a Series back instead of a single value. You could try this slight variation:
def f(x):
    # Divide by a single per-state value (the mean) rather than the whole column
    engaged_percent = x['engaged_count'].nunique()/x['user_count'].mean()
    return engaged_percent

by = df3.groupby(['user_state']).apply(f)
by
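To see the difference between the two versions, it may help to look at a single group in isolation. The sketch below uses made-up data (the original dataframe is not shown), and the names `group`, `as_series` and `as_scalar` are introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df3.
df3 = pd.DataFrame({
    "user_state": ["California", "California", "California", "Florida", "Florida"],
    "user_count": [21, 21, 21, 7, 7],
    "engaged_count": [1, 1, 2, 3, 4],
})

# Take one group by hand, as groupby would pass it to f.
group = df3[df3["user_state"] == "California"]
n_unique = group["engaged_count"].nunique()  # a single integer

# Dividing a number by a Series broadcasts: one result per row of the group.
as_series = n_unique / group["user_count"]

# Reducing the column first (here with mean) gives one number per group.
as_scalar = n_unique / group["user_count"].mean()

print(type(as_series).__name__, len(as_series))
print(as_scalar)
```

The first division keeps one entry per row of the group, which is why the original function returned a Series for every state; the second reduces the group to one number.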