Count items greater than a value in pandas groupby
Problem description
I have the Yelp dataset and I want to count all reviews that have more than 3 stars. I get the number of reviews per business by doing this:
reviews.groupby('business_id')['stars'].count()
Now I want to get the count of reviews which had more than 3 stars, so I tried this by taking inspiration from here:
reviews.groupby('business_id')['stars'].agg({'greater':lambda val: (val > 3).count()})
But this just gives me the count of all reviews, like before. Is this the right way to do it, and what am I doing incorrectly here? Does the lambda expression not go through each value of the stars column?
Okay, I feel stupid. I should have used the sum function instead of count to get the number of elements greater than 3, like this:
reviews.groupby('business_id')['stars'].agg({'greater':lambda val: (val > 3).sum()})
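The difference comes from what each method does to the boolean Series produced by `val > 3`: `count()` counts every non-null entry, whether True or False, so it returns the group size, while `sum()` treats True as 1 and False as 0. The sketch below illustrates this on a small made-up DataFrame (the data is hypothetical, not the Yelp dataset). It passes the lambda to `agg` directly rather than in a dict, on the assumption that a recent pandas version is in use, since those reject dict-based renaming on a grouped Series.

```python
import pandas as pd

# Hypothetical toy data standing in for the Yelp reviews table.
reviews = pd.DataFrame({
    'business_id': ['a', 'a', 'a', 'b', 'b', 'c'],
    'stars':       [5,   2,   4,   1,   3,   5],
})

grouped = reviews.groupby('business_id')['stars']

# count() counts non-null booleans (True and False alike),
# so the result is simply the size of each group.
counted = grouped.agg(lambda val: (val > 3).count())

# sum() adds True as 1 and False as 0,
# so the result is the number of reviews above 3 stars.
greater = grouped.agg(lambda val: (val > 3).sum())

print(counted.to_dict())
print(greater.to_dict())
```

Here business `a` has three reviews but only two above 3 stars, so the two results differ for it.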
Recommended answer
You can try:
reviews[reviews['stars'] > 3].groupby('business_id')['stars'].count()
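One thing to be aware of with filtering before grouping: a business with no reviews above 3 stars disappears from the result instead of showing a count of 0. If the zeros matter, one possible alternative is to group the boolean mask itself by the business column. The sketch below uses made-up data (hypothetical, not the Yelp dataset) to compare the two.

```python
import pandas as pd

# Hypothetical toy data; business 'b' has no review above 3 stars.
reviews = pd.DataFrame({
    'business_id': ['a', 'a', 'a', 'b', 'b', 'c'],
    'stars':       [5,   2,   4,   1,   3,   5],
})

# Filter first, then group: businesses with no matching rows are dropped.
filtered = reviews[reviews['stars'] > 3].groupby('business_id')['stars'].count()

# Group the boolean mask instead: every business is kept,
# and summing the mask counts the True values per group.
all_groups = (reviews['stars'] > 3).groupby(reviews['business_id']).sum()

print(filtered.to_dict())
print(all_groups.to_dict())
```

Which form is preferable depends on whether businesses with zero matching reviews should appear in the output.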