Pandas: Group by a column that meets a condition
I have a data set with three columns: rating, breed, and dog.
import pandas as pd
dogs = {'breed': ['Chihuahua', 'Chihuahua', 'Dalmatian', 'Sphynx'],
'dog': [True, True, True, False],
'rating': [8.0, 9.0, 10.0, 7.0]}
df = pd.DataFrame(data=dogs)
I would like to calculate the mean rating per breed where dog is True. This would be the expected result:
breed rating
0 Chihuahua 8.5
1 Dalmatian 10.0
This has been my attempt:
df.groupby('breed')['rating'].mean().where(dog == True)
And this is the error that I get:
NameError: name 'dog' is not defined
Whenever I try to add the where condition, I only get errors. Can anyone advise a solution? TIA
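Once you group by breed and select the rating column, the dog column no longer exists in the result, so there is nothing left for where to test. Even if it did exist, you are not accessing it correctly: a bare dog is an undefined Python name, whereas the column has to be reached through the dataframe, as df['dog'] or df.dog.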
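To make the two problems concrete, here is a small sketch using the dataframe from the question. The variable name `means` is only illustrative. It shows that the grouped result is a Series indexed by breed with no dog column, and that a bare `dog` fails as an ordinary Python name lookup.

```python
import pandas as pd

dogs = {'breed': ['Chihuahua', 'Chihuahua', 'Dalmatian', 'Sphynx'],
        'dog': [True, True, True, False],
        'rating': [8.0, 9.0, 10.0, 7.0]}
df = pd.DataFrame(data=dogs)

# After groupby + column selection + mean, the result is a Series
# indexed by breed; the 'dog' column is not part of it.
means = df.groupby('breed')['rating'].mean()
print(type(means).__name__)  # Series
print(list(means.index))     # ['Chihuahua', 'Dalmatian', 'Sphynx']

# A bare `dog` is a plain Python variable lookup, not a column lookup,
# so evaluating it raises NameError before pandas is ever involved.
try:
    dog == True
except NameError as exc:
    print(exc)  # name 'dog' is not defined
```

Note that the Sphynx row is still present in `means`, which is why the filtering has to happen before the aggregation.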
Filter your dataframe first, then use groupby with mean:
df[df.dog].groupby('breed')['rating'].mean().reset_index()
breed rating
0 Chihuahua 8.5
1 Dalmatian 10.0
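For completeness, the same filter-then-group approach can be written as a self-contained script. This is only a sketch of an equivalent spelling, not the one used in the answer: it uses bracket access (`df['dog']`) for the boolean mask and `as_index=False` in place of `reset_index()`, and the names `only_dogs` and `result` are illustrative.

```python
import pandas as pd

dogs = {'breed': ['Chihuahua', 'Chihuahua', 'Dalmatian', 'Sphynx'],
        'dog': [True, True, True, False],
        'rating': [8.0, 9.0, 10.0, 7.0]}
df = pd.DataFrame(data=dogs)

# Step 1: keep only the rows where the boolean 'dog' column is True.
only_dogs = df[df['dog']]

# Step 2: group the remaining rows by breed and average the ratings.
# as_index=False keeps 'breed' as a regular column in the result.
result = only_dogs.groupby('breed', as_index=False)['rating'].mean()
print(result)
```

Bracket access is a slightly safer habit than `df.dog`, because attribute access stops working when a column name collides with an existing DataFrame attribute or is not a valid Python identifier.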