Filter group of rows based on sum of values from different column

Question

I'm trying to filter out whole rows in R, but only if the frequencies for a particular set don't add up to more than 5.

The data I have looks a bit like this. It's a dataframe that I'm currently calling "Words":

HEADWORD VARIANT FREQUENCY
 SWORD    sword      2
 SWORD    swerd      1
 SWORD    sworde     1
 KNIGHT   knight     6
 KNIGHT   kniht      2
 KNIGHT   knyt       1

I only want rows for which the frequencies within a particular headword add up to more than 5. So here, I want to keep all the instances of KNIGHT but I want to get rid of all the SWORD rows entirely.

I tried to do this with dplyr, but with no success. This is the code I tried:

Words1 %>% group_by(HW) %>%  filter(Fr > 5)

Recommended answer

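We need to group by 'HEADWORD' and then, inside filter, check whether the sum of 'FREQUENCY' for each group is greater than 5. A condition on the raw column compares each row's own value to 5, whereas sum() inside a grouped filter yields one total per group, so every row of a group is kept or dropped together.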

library(dplyr)

# Keep every row of a HEADWORD group whose FREQUENCY values sum to more than 5
Words1 %>%
  group_by(HEADWORD) %>%
  filter(sum(FREQUENCY) > 5)
#   HEADWORD VARIANT FREQUENCY
#     <chr>   <chr>     <int>
#1   KNIGHT  knight         6
#2   KNIGHT   kniht         2
#3   KNIGHT    knyt         1
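
For comparison, the same group-level filter can be sketched in base R without dplyr. This is only an illustrative alternative, not part of the original answer: the data frame is rebuilt by hand from the table in the question (the column types are an assumption), and `group_total` and `kept` are names introduced here for the example.

```r
# Rebuild the example data from the question (column types are assumed).
Words1 <- data.frame(
  HEADWORD  = c("SWORD", "SWORD", "SWORD", "KNIGHT", "KNIGHT", "KNIGHT"),
  VARIANT   = c("sword", "swerd", "sworde", "knight", "kniht", "knyt"),
  FREQUENCY = c(2L, 1L, 1L, 6L, 2L, 1L),
  stringsAsFactors = FALSE
)

# ave() applies sum() within each HEADWORD group and returns a vector
# as long as the input, so every row carries its group's total
# (SWORD: 2 + 1 + 1 = 4, KNIGHT: 6 + 2 + 1 = 9).
group_total <- ave(Words1$FREQUENCY, Words1$HEADWORD, FUN = sum)

# Keep only the rows whose group total exceeds 5.
kept <- Words1[group_total > 5, ]
print(kept)
```

Because the comparison is made on the per-group total rather than on each row's own FREQUENCY, all three KNIGHT rows survive even though two of them have individual frequencies below 5.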
