Filter group of rows based on sum of values from different column
Problem description
I'm trying to filter out whole rows in R, but only if the frequencies for a particular set don't add up to more than 5.
The data I have looks a bit like this. It's a data frame that I'm currently calling "Words1":
HEADWORD VARIANT FREQUENCY
SWORD sword 2
SWORD swerd 1
SWORD sworde 1
KNIGHT knight 6
KNIGHT kniht 2
KNIGHT knyt 1
I only want rows for which the frequencies within a particular headword add up to more than 5. So here, I want to keep all the instances of KNIGHT but I want to get rid of all the SWORD rows entirely.
I tried to do this with dplyr, but without success. This is the code I tried:
Words1 %>% group_by(HW) %>% filter(Fr > 5)
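A plain comparison inside filter() is evaluated row by row, so it only keeps rows whose own frequency exceeds 5. Instead, after grouping by 'HEADWORD', we need to take the sum of 'FREQUENCY' within each group and check in the filter whether that sum is greater than 5.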
library(dplyr)
Words1 %>%
  group_by(HEADWORD) %>%
  filter(sum(FREQUENCY) > 5)
# HEADWORD VARIANT FREQUENCY
# <chr> <chr> <int>
#1 KNIGHT knight 6
#2 KNIGHT kniht 2
#3 KNIGHT knyt 1
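For anyone who wants to check the idea without dplyr, here is a minimal base R sketch of the same group-sum filter. It is an alternative, not the method used in the answer: it rebuilds the example data by hand (the name `Words1` and the column names follow the answer, while `group_total` and `kept` are just illustrative names) and uses `ave()` to attach each headword's total to every row.

```r
# Rebuild the example data from the question (assumed layout: one row per variant).
Words1 <- data.frame(
  HEADWORD  = c("SWORD", "SWORD", "SWORD", "KNIGHT", "KNIGHT", "KNIGHT"),
  VARIANT   = c("sword", "swerd", "sworde", "knight", "kniht", "knyt"),
  FREQUENCY = c(2L, 1L, 1L, 6L, 2L, 1L),
  stringsAsFactors = FALSE
)

# ave() returns one value per row: the sum of FREQUENCY within that row's HEADWORD.
# Here SWORD rows get 2 + 1 + 1 = 4 and KNIGHT rows get 6 + 2 + 1 = 9.
group_total <- ave(Words1$FREQUENCY, Words1$HEADWORD, FUN = sum)

# Keep only the rows whose group total exceeds 5.
kept <- Words1[group_total > 5, ]
print(kept)
```

Because the condition is computed per group and then repeated for every row of that group, whole headwords are kept or dropped together, which is the same behaviour as `filter(sum(FREQUENCY) > 5)` on a grouped data frame.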