Pandas, groupby where column value is greater than x

Views: 41 Posted: 2020/5/24 0:59:28 Tags: python pandas

Question

I have a table like this:

    timestamp   avg_hr  hr_quality  avg_rr  rr_quality  activity    sleep_summary_id

    1422404668  66      229             0       0           13              78
    1422404670  64      223             0       0           20              78
    1422404672  64      216             0       0           11              78
    1422404674  66      198             0       40          9               78
    1422404676  65      184             0       30          3               78
    1422404678  64      173             0       10          17              78
    1422404680  66      199             0       20          118             78

I'm trying to group the data by timestamp, sleep id and rr_quality, where rr_quality is > 0.

I've tried the following, and none of these attempts seems to work:

 df3 = df2.groupby([df2.index.hour,'sleep_summary_id',df2['rr_quality']>0])

 df3 = df2.groupby([df2.index.hour,'sleep_summary_id','rr_quality'>0])

 df3 = df2.groupby([df2.index.hour,'sleep_summary_id',['rr_quality']>0])

All of them return a KeyError.

I also can't seem to pass more than one filter at a time. I tried the following:

df2[df2['rr_quality'] >= 150, df2['hr_quality'] > 200]
df2[df2['rr_quality'] >= 150, ['hr_quality'] > 200]
df2[[df2['rr_quality'] >= 150, ['hr_quality'] > 200]]

These return: TypeError: 'Series' objects are mutable, thus they cannot be hashed

Recommended answer

The simplest thing to do here is to filter the df first and then perform the groupby:

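filtered = df2[df2['rr_quality'] > 0]  # keep only rows with rr_quality > 0
filtered.groupby([filtered.index.hour, 'sleep_summary_id'])  # hour key comes from the filtered index so lengths match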
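
A minimal, self-contained sketch of the filter-then-group approach, using the rows from the question's table. It assumes the timestamp column holds Unix epoch seconds and is converted to a DatetimeIndex (the question does not show how df2 was indexed), and the variable names filtered and mean_hr are only illustrative. The hour key is taken from the filtered frame's own index so that the grouping key and the filtered rows have the same length.

```python
import pandas as pd

# Sample rows copied from the question's table.
df2 = pd.DataFrame({
    "timestamp": [1422404668, 1422404670, 1422404672, 1422404674,
                  1422404676, 1422404678, 1422404680],
    "avg_hr": [66, 64, 64, 66, 65, 64, 66],
    "hr_quality": [229, 223, 216, 198, 184, 173, 199],
    "avg_rr": [0, 0, 0, 0, 0, 0, 0],
    "rr_quality": [0, 0, 0, 40, 30, 10, 20],
    "activity": [13, 20, 11, 9, 3, 17, 118],
    "sleep_summary_id": [78] * 7,
})

# Assumption: timestamps are epoch seconds; .hour needs a DatetimeIndex.
df2.index = pd.to_datetime(df2["timestamp"].to_numpy(), unit="s")

# Step 1: filter first, keeping only rows where rr_quality > 0.
filtered = df2[df2["rr_quality"] > 0]

# Step 2: group the filtered rows by hour of day and sleep id.
grouped = filtered.groupby([filtered.index.hour, "sleep_summary_id"])

# Step 3: aggregate, here the mean heart rate per group.
mean_hr = grouped["avg_hr"].mean()
print(mean_hr)
```

A groupby object on its own is lazy; nothing is computed until an aggregation such as mean(), size() or agg() is called on it.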

Edit

If you're intending to assign this back to your original df:

mask = df2['rr_quality'] > 0  # use the same condition on both sides of the assignment
df2.loc[mask, 'AVG_HR'] = df2[mask].groupby([df2.index[mask].hour, 'sleep_summary_id'])['avg_hr'].transform('mean')

The loc call masks the left-hand side so that the result of the transform aligns correctly.
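
As a rough illustration of that alignment, the sketch below writes the per-group mean back onto only the rows that pass the filter. It reuses a subset of the question's sample data and assumes epoch-second timestamps as the index; the names mask, filtered and group_mean are hypothetical helpers, not part of the original answer.

```python
import pandas as pd

# Subset of the question's columns, indexed by the parsed timestamps.
df2 = pd.DataFrame(
    {
        "avg_hr": [66, 64, 64, 66, 65, 64, 66],
        "rr_quality": [0, 0, 0, 40, 30, 10, 20],
        "sleep_summary_id": [78] * 7,
    },
    index=pd.to_datetime(
        [1422404668, 1422404670, 1422404672, 1422404674,
         1422404676, 1422404678, 1422404680],
        unit="s",  # assumption: epoch seconds
    ),
)

# One boolean mask, reused for both the filter and the assignment.
mask = df2["rr_quality"] > 0
filtered = df2[mask]

# transform returns one value per filtered row, indexed like `filtered`.
group_mean = filtered.groupby(
    [filtered.index.hour, "sleep_summary_id"]
)["avg_hr"].transform("mean")

# Rows outside the mask are left as NaN in the new column.
df2.loc[mask, "AVG_HR"] = group_mean
print(df2)
```

Unlike mean(), which returns one row per group, transform('mean') returns a result with the same index as its input, which is what lets it line up with the masked rows.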

To filter using multiple conditions, you need to use the element-wise operators &, | and ~ for and, or and not respectively. You also need to wrap each condition in parentheses, because these operators bind more tightly than comparisons such as >= and >:

df2[(df2['rr_quality'] >= 150) & (df2['hr_quality'] > 200)]
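
A short sketch of the three operators on the question's sample values. The thresholds 20 and 190 are chosen here only so that the sample rows produce non-empty results; they are not from the original post. Separating conditions with a comma, as in the question, hands pandas a tuple to use as a single key rather than combining the masks, which is the likely source of the hashing error.

```python
import pandas as pd

df2 = pd.DataFrame({
    "hr_quality": [229, 223, 216, 198, 184, 173, 199],
    "rr_quality": [0, 0, 0, 40, 30, 10, 20],
})

# Hypothetical thresholds picked to fit the sample data.
good_rr = df2["rr_quality"] >= 20
good_hr = df2["hr_quality"] > 190

both = df2[good_rr & good_hr]          # and: both conditions hold
either = df2[good_rr | good_hr]        # or: at least one holds
neither = df2[~(good_rr | good_hr)]    # not: neither holds

# Inline form: each comparison needs its own parentheses.
inline = df2[(df2["rr_quality"] >= 20) & (df2["hr_quality"] > 190)]

print(both)
print(either)
print(neither)
```

Naming the masks first, as above, sidesteps the precedence issue and makes long filters easier to read.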
