How to conditionally partition observations into groups?

Problem description

I have the following input:

C1  C2
1   1
1   1
1   2
1   3
1   4
2   1
.   .

C1 and C2 are grouping variables, with C2 nested within C1. I would like to build subgroups within each C1 group that contain at least 2 observations. The C2 groups must not be split, and I would like to end up with as many groups as possible. Doing this by hand for C1 = 1, I would take subgroup C2 = 1 as a group of its own (G = 1) and join subgroups C2 = 2, 3 and 4 into a second group (G = 2). The expected output is shown below, where G is the group I am trying to create:

C1  C2  G
1   1   1
1   1   1
1   2   2
1   3   2
1   4   2
2   1   3
.   .   .

I hope it's clear what I mean. Any help is highly appreciated.

Recommended answer

Using:

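library(data.table)
# Within each C1, number the rows in pairs (1, 1, 2, 2, ...); if one row is
# left over, attach it to the last pair so that no group has fewer than 2 rows.
setDT(mydf)[, G := {r <- rep(1:floor(.N/2), each = 2); if (length(r) != .N) c(r, tail(r, 1)) else r}
            , by = C1
            # Renumber the (C1, G) runs consecutively over the whole table;
            # including C1 keeps adjacent C1 groups with equal G from merging.
            ][, G := rleid(C1, G)][]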

you get:


    C1 C2 G
 1:  1  1 1
 2:  1  1 1
 3:  1  2 2
 4:  1  3 2
 5:  1  4 2
 6:  2  1 3
 7:  2  1 3
 8:  2  2 4
 9:  2  3 4
10:  2  4 4
11:  3  1 5
12:  3  2 5
13:  3  3 6
14:  3  4 6
15:  3  5 6
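To make the per-group expression in the answer easier to follow, here is a minimal base R sketch of the same pairing logic on its own. The helper name `pair_ids` is hypothetical and is not part of the original answer; it only wraps the expression that the answer evaluates once per C1 group, with `n` standing in for `.N`.

```r
# Hypothetical helper: the answer's per-group expression, with n in place of .N.
pair_ids <- function(n) {
  # Ids in pairs: 1, 1, 2, 2, ... (covers n rows if n is even, n - 1 if odd)
  r <- rep(1:floor(n / 2), each = 2)
  # If one row is left over, give it the id of the last pair
  if (length(r) != n) c(r, tail(r, 1)) else r
}

pair_ids(4)  # 1 1 2 2
pair_ids(5)  # 1 1 2 2 2
```

Note that the ids depend only on row position within C1, not on C2, and that the expression assumes every C1 group has at least 2 rows (for `n = 1` it returns more ids than there are rows).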







Used data:

mydf <- structure(list(C1 = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L), 
                       C2 = c(1L, 1L, 2L, 3L, 4L, 1L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L)), 
                  .Names = c("C1", "C2"), class = "data.frame", row.names = c(NA, -15L))
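Because the answer assigns groups by row position, it may be worth checking the result against the two requirements from the question. The following is a rough base R sketch under stated assumptions: it rebuilds the example data together with the G column from the output shown above, and the names `min_size_ok`, `g_per_pair` and `no_split` are illustrative only.

```r
# Example data with the G column copied from the output shown in the answer
mydf <- data.frame(
  C1 = rep(1:3, each = 5),
  C2 = c(1, 1, 2, 3, 4, 1, 1, 2, 3, 4, 1, 2, 3, 4, 5)
)
mydf$G <- c(1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6)

# Requirement 1: every group G contains at least 2 observations
min_size_ok <- all(table(mydf$G) >= 2)

# Requirement 2: no (C1, C2) subgroup is spread over more than one G
g_per_pair <- tapply(
  mydf$G,
  interaction(mydf$C1, mydf$C2, drop = TRUE),
  function(g) length(unique(g))
)
no_split <- all(g_per_pair == 1)

print(min_size_ok)  # TRUE
print(no_split)     # TRUE
```

Both checks pass for this data set. With other data, where a C2 subgroup happens to straddle a pair boundary, the second check would be the one to watch.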
