Remove groups with less than three unique observations


Question

I would like to subset my data frame to keep only groups that have 3 or more observations on DIFFERENT days. I want to get rid of groups that have fewer than 3 observations, or whose observations do not come from at least 3 different days.

Here is an example data set:

Group   Day
1       1 
1       3
1       5
1       5
2       2
2       2  
2       4 
2       4
3       1
3       2
3       3
4       1
4       5

So for the above example, groups 1 and 3 will be kept, and groups 2 and 4 will be removed from the data frame.

I hope this makes sense. I imagine the solution is quite simple, but I can't work it out (I'm quite new to R and not very fast at coming up with solutions to things like this). I thought the diff function might come in handy, but I didn't get much further.

Recommended answer

With data.table you can do:

library(data.table)
# Return a group's rows (.SD) only if it has at least 3 distinct Day values
DT[, if(uniqueN(Day) >= 3) .SD, by = Group]

which gives:

   Group Day
1:     1   1
2:     1   3
3:     1   5
4:     1   5
5:     3   1
6:     3   2
7:     3   3

Or using dplyr:

library(dplyr)
DT %>% 
  group_by(Group) %>% 
  filter(n_distinct(Day) >= 3)

which gives the same result.
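For readers who have neither package installed, the same filter can be sketched in base R. This is only an illustration, not part of the original answer: the data frame name `df` and the helper vector `n_days` are assumptions, and the data is rebuilt by hand from the example in the question. `ave()` applies a function within each group and returns a vector as long as the input, so every row receives its group's count of distinct days.

```r
# Rebuild the example data from the question (the name `df` is an assumption)
df <- data.frame(
  Group = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4),
  Day   = c(1, 3, 5, 5, 2, 2, 4, 4, 1, 2, 3, 1, 5)
)

# For every row, count the distinct Day values within that row's Group
n_days <- ave(df$Day, df$Group, FUN = function(x) length(unique(x)))

# Keep only the rows whose group spans at least 3 different days
kept <- df[n_days >= 3, ]
print(kept)
```

Note that group 2 has four rows but only two distinct days, so a plain row count per group would wrongly keep it; the check has to count unique `Day` values, as `uniqueN()` and `n_distinct()` do in the answers above.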
