Count items greater than a value in pandas groupby

Problem description

I have the Yelp dataset and I want to count all reviews that have more than 3 stars. I get the count of reviews per business by doing this:

reviews.groupby('business_id')['stars'].count()

Now I want to get the count of reviews that have more than 3 stars, so I tried this, taking inspiration from here:

reviews.groupby('business_id')['stars'].agg({'greater':lambda val: (val > 3).count()})

But this just gives me the count of all stars, as before. I am not sure this is the right way to do it. What am I doing wrong here? Does the lambda expression not go through each value of the stars column?

Okay, I feel stupid. I should have used sum instead of count to get the number of elements greater than 3, like this:

reviews.groupby('business_id')['stars'].agg({'greater':lambda val: (val > 3).sum()})
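The difference between the two attempts comes down to what `(val > 3)` produces: a boolean Series with one entry per review. `count()` tallies every non-null entry, True and False alike, while `sum()` adds True as 1 and False as 0. The sketch below shows this on a small made-up table (the data is hypothetical, not the Yelp dataset). It also uses named aggregation in place of the dict form, since recent pandas versions, as far as I know, no longer accept a renaming dict in `agg` on a single grouped column.

```python
import pandas as pd

# Hypothetical toy data standing in for the Yelp reviews table.
reviews = pd.DataFrame({
    "business_id": ["a", "a", "a", "b", "b", "c"],
    "stars":       [5,   2,   4,   1,   3,   5],
})

grouped = reviews.groupby("business_id")["stars"]

# count() tallies non-null entries, so True and False are both counted:
# this is just the number of reviews per business.
counted = grouped.agg(lambda val: (val > 3).count())

# sum() treats True as 1 and False as 0, so only the matches are counted.
summed = grouped.agg(lambda val: (val > 3).sum())

# Named aggregation: the keyword becomes the output column name.
named = reviews.groupby("business_id").agg(
    greater=("stars", lambda val: (val > 3).sum())
)

print(counted)
print(summed)
print(named)
```

With this toy data, `counted` matches the plain per-business review count, while `summed` and `named["greater"]` hold only the reviews above 3 stars.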

Recommended answer

You can try:

reviews[reviews['stars'] > 3].groupby('business_id')['stars'].count()
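One thing to keep in mind with the filter-first approach: a business with no reviews above 3 stars disappears from the result instead of showing up with a count of 0. The sketch below illustrates this on a small made-up table (hypothetical data, not the Yelp dataset) and shows two possible ways to keep the zero rows, assuming that is what you want.

```python
import pandas as pd

# Hypothetical toy data standing in for the Yelp reviews table.
reviews = pd.DataFrame({
    "business_id": ["a", "a", "a", "b", "b", "c"],
    "stars":       [5,   2,   4,   1,   3,   5],
})

# Filter first, then count: business "b" has no review above 3 stars,
# so it is absent from the result.
filtered = reviews[reviews["stars"] > 3].groupby("business_id")["stars"].count()

# Option 1: reindex against every business id and fill the gaps with 0.
all_ids = reviews["business_id"].unique()
with_zeros = filtered.reindex(all_ids, fill_value=0)

# Option 2: group the boolean mask itself and sum it per business.
mask_sum = reviews["stars"].gt(3).groupby(reviews["business_id"]).sum()

print(filtered)
print(with_zeros)
print(mask_sum)
```

If businesses without any matching reviews are not of interest, the plain filter-then-count form is enough.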
