Pandas, groupby where column value is greater than x
Question
I have a table like this:
timestamp avg_hr hr_quality avg_rr rr_quality activity sleep_summary_id
1422404668 66 229 0 0 13 78
1422404670 64 223 0 0 20 78
1422404672 64 216 0 0 11 78
1422404674 66 198 0 40 9 78
1422404676 65 184 0 30 3 78
1422404678 64 173 0 10 17 78
1422404680 66 199 0 20 118 78
I'm trying to group the data by timestamp, sleep id and rr_quality, where rr_quality is > 0.
I've tried the following, and none of them seems to work:
df3 = df2.groupby([df2.index.hour,'sleep_summary_id',df2['rr_quality']>0])
df3 = df2.groupby([df2.index.hour,'sleep_summary_id','rr_quality'>0])
df3 = df2.groupby([df2.index.hour,'sleep_summary_id',['rr_quality']>0])
All of them return a KeyError.
I also can't seem to pass more than one filter at a time. I tried the following:
df2[df2['rr_quality'] >= 150, df2['hr_quality'] > 200]
df2[df2['rr_quality'] >= 150, ['hr_quality'] > 200]
df2[[df2['rr_quality'] >= 150, ['hr_quality'] > 200]]
These return: TypeError: 'Series' objects are mutable, thus they cannot be hashed
Recommended answer
The simplest thing to do here is to filter the df first and then perform the groupby on the filtered result:
filtered = df2[df2['rr_quality'] > 0]
filtered.groupby([filtered.index.hour,'sleep_summary_id'])
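As a rough, self-contained sketch of the filter-then-group approach, the snippet below rebuilds the rows from the question and takes the mean avg_hr per group. It assumes the epoch-second timestamp column has been turned into a DatetimeIndex (the question does not show how df2 was built), and the names filtered, hour and mean_hr are made up for illustration. The hour key is taken from the filtered frame so that it has the same length as the data being grouped.

```python
import pandas as pd

# Sample rows copied from the question's table.
df = pd.DataFrame({
    "timestamp": [1422404668, 1422404670, 1422404672, 1422404674,
                  1422404676, 1422404678, 1422404680],
    "avg_hr": [66, 64, 64, 66, 65, 64, 66],
    "hr_quality": [229, 223, 216, 198, 184, 173, 199],
    "avg_rr": [0, 0, 0, 0, 0, 0, 0],
    "rr_quality": [0, 0, 0, 40, 30, 10, 20],
    "activity": [13, 20, 11, 9, 3, 17, 118],
    "sleep_summary_id": [78] * 7,
})

# Assumption: the epoch seconds are converted into a DatetimeIndex.
df2 = df.set_index(pd.to_datetime(df["timestamp"], unit="s")).drop(columns="timestamp")

# Filter first, then group the filtered frame by its own index hour.
filtered = df2[df2["rr_quality"] > 0]
hour = filtered.index.hour.rename("hour")
mean_hr = filtered.groupby([hour, "sleep_summary_id"])["avg_hr"].mean()
print(mean_hr)
```

With this sample all rows fall in the same hour and share one sleep_summary_id, so the result has a single group.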
Edit
If you're intending to assign this back to your original df:
mask = df2['rr_quality'] > 0
filtered = df2[mask]
df2.loc[mask, 'AVG_HR'] = filtered.groupby([filtered.index.hour,'sleep_summary_id'])['avg_hr'].transform('mean')
The loc call masks the left-hand side so that the result of the transform aligns correctly with the rows being assigned.
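To make the alignment concrete, here is a minimal sketch using a few columns from the question's sample rows. The DatetimeIndex is an assumption, and mask, filtered and group_mean are illustrative names. transform returns one value per filtered row, carrying the same index labels, so the assignment through loc writes only to the masked rows and leaves the rest as NaN.

```python
import pandas as pd

# Sample rows from the question; building a DatetimeIndex is an assumption.
idx = pd.to_datetime([1422404668, 1422404670, 1422404672, 1422404674,
                      1422404676, 1422404678, 1422404680], unit="s")
df2 = pd.DataFrame({
    "avg_hr": [66, 64, 64, 66, 65, 64, 66],
    "rr_quality": [0, 0, 0, 40, 30, 10, 20],
    "sleep_summary_id": [78] * 7,
}, index=idx)

mask = df2["rr_quality"] > 0
filtered = df2[mask]

# transform returns one value per filtered row, indexed like `filtered`.
group_mean = (
    filtered.groupby([filtered.index.hour, "sleep_summary_id"])["avg_hr"]
    .transform("mean")
)

# Only the masked rows are written; unmasked rows end up as NaN.
df2.loc[mask, "AVG_HR"] = group_mean
print(df2)
```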
To filter using multiple conditions you need to use the element-wise operators &, | and ~ for and, or and not respectively. Additionally, you need to wrap each condition in parentheses, because these operators bind more tightly than comparisons such as >= and >:
df2[(df2['rr_quality'] >= 150) & (df2['hr_quality'] > 200)]
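A small sketch of the three operators on two columns from the question's sample rows follows. The thresholds (190 and 0) are chosen only so that the sample produces non-empty results, and the variable names are illustrative. Storing each condition in its own variable is one way to sidestep the parenthesis issue; written inline, each comparison needs its own parentheses.

```python
import pandas as pd

df2 = pd.DataFrame({
    "hr_quality": [229, 223, 216, 198, 184, 173, 199],
    "rr_quality": [0, 0, 0, 40, 30, 10, 20],
})

# Each comparison yields a boolean Series, one entry per row.
high_hr = df2["hr_quality"] > 190
has_rr = df2["rr_quality"] > 0

both = df2[high_hr & has_rr]      # and: rows meeting both conditions
either = df2[high_hr | has_rr]    # or: rows meeting at least one condition
no_rr = df2[~has_rr]              # not: rows where rr_quality is not > 0

# Inline form of the "and" filter, with each condition parenthesized.
both_inline = df2[(df2["hr_quality"] > 190) & (df2["rr_quality"] > 0)]

print(len(both), len(either), len(no_rr))
```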