Pandas: Group by a column that meets a condition

Problem description

I have a data set with three columns: rating, breed, and dog.

import pandas as pd
dogs = {'breed': ['Chihuahua', 'Chihuahua', 'Dalmatian', 'Sphynx'],
        'dog': [True, True, True, False],
        'rating': [8.0, 9.0, 10.0, 7.0]}

df = pd.DataFrame(data=dogs)

I would like to calculate the mean rating per breed where dog is True. This is the expected result:

  breed     rating
0 Chihuahua 8.5   
1 Dalmatian 10.0  

This has been my attempt:

df.groupby('breed')['rating'].mean().where(dog == True)

And this is the error that I get:

NameError: name 'dog' is not defined

Whenever I try to add the where condition, I only get errors. Can anyone advise a solution? TIA

Solution

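Once you group by breed and select the rating column, the dog column no longer exists in the object you are working with. Even if it did, you are not accessing it correctly: a bare dog is an undefined Python name rather than a reference to the column, which is why you get the NameError.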
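
To see why, it can help to inspect what the groupby expression returns before where is called. The following is a minimal sketch using the question's data; the variable name means is only for illustration.

```python
import pandas as pd

dogs = {'breed': ['Chihuahua', 'Chihuahua', 'Dalmatian', 'Sphynx'],
        'dog': [True, True, True, False],
        'rating': [8.0, 9.0, 10.0, 7.0]}
df = pd.DataFrame(data=dogs)

# Grouping by breed and selecting 'rating' yields a Series indexed by breed.
# It holds only the mean ratings; the 'dog' column is not part of it.
means = df.groupby('breed')['rating'].mean()

print(type(means).__name__)  # the result is a Series, not a DataFrame
print(list(means.index))     # the index holds the breed labels
print(means.name)            # the only data carried is 'rating'
```

Because the Sphynx row is still included at this point, the condition has to be applied before the aggregation rather than after it.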

Filter your dataframe first, then use groupby with mean:

df[df.dog].groupby('breed')['rating'].mean().reset_index()

       breed  rating
0  Chihuahua     8.5
1  Dalmatian    10.0
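
As a self-contained variant of the same idea, the sketch below uses bracket access for the boolean mask and passes as_index=False to groupby in place of the trailing reset_index(). This is one possible spelling, not the answer's own code, and the names mask and result are illustrative.

```python
import pandas as pd

dogs = {'breed': ['Chihuahua', 'Chihuahua', 'Dalmatian', 'Sphynx'],
        'dog': [True, True, True, False],
        'rating': [8.0, 9.0, 10.0, 7.0]}
df = pd.DataFrame(data=dogs)

# Boolean mask built from the 'dog' column; bracket access avoids any
# clash between a column name and an existing DataFrame attribute.
mask = df['dog']

# Filter first, then group. as_index=False keeps 'breed' as a regular
# column, so no reset_index() call is needed afterwards.
result = df[mask].groupby('breed', as_index=False)['rating'].mean()

print(result)
```

Filtering before grouping also means the rows where dog is False never enter the aggregation, so they cannot influence any group's mean.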
