Special grouping number for each pair

Views: 103  Published: 2020/6/2 20:33:40  Tags: r dplyr aggregate

Problem description

Part of this question has already been answered in special-group-number-for-each-combination-of-data. In most cases the data contains pairs mixed with other values. What we want is to number the groups wherever such pairs occur, and to carry each number forward until the next pair.

Each pair, such as c("bad","good"), should form its own group, while the names c('Veni',"vidi","Vici") should be assigned the unique number 666.

Here is the sample data:

names <- c(c("bad","good"),1,2,c("good","bad"),111,c("bad","J.James"),c("good","J.James"),333,c("J.James","good"),761,'Veni',"vidi","Vici")
df <- data.frame(names)

Here is the expected output for the real, general case:

     names  Group
1      bad    1
2     good    1
3        1    1
4        2    1
5     good    2
6      bad    2
7      111    2
8      bad    3
9  J.James    3
10    good    4
11 J.James    4
12     333    4
13 J.James    5
14    good    5
15     761    5
16    Veni    666
17    vidi    666
18    Vici    666

Recommended answer

Here are two approaches which reproduce OP's expected result for the given sample dataset.

Both work in the same way. First, all "disturbing" rows, i.e., rows which do not contain "valid" names (here, the numeric values), are skipped, and the rows with "valid" names are simply numbered in groups of 2. Second, the rows with exempt names are given the special group number. Finally, the remaining NA rows are filled by carrying the last observation forward.

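`data.table`

library(data.table)
names <- c(c("bad","good"),1,2,c("good","bad"),111,c("bad","J.James"),c("good","J.James"),333,c("J.James","good"),761,'Veni',"vidi","Vici")
exempt <- c("Veni", "vidi", "Vici")
# 1. number the rows with valid names (not numeric, not exempt) in groups of 2
# 2. give the rows with exempt names the special group number 666
# 3. fill the remaining NA rows by carrying the last observation forward
# suppressWarnings() hides the expected "NAs introduced by coercion" warning
data.table(names)[is.na(suppressWarnings(as.numeric(names))) & !names %in% exempt, 
                  grp := rep(1:.N, each = 2L, length.out = .N)][
                    names %in% exempt, grp := 666L][
                      , grp := zoo::na.locf(grp)][]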

      names grp
 1:     bad   1
 2:    good   1
 3:       1   1
 4:       2   1
 5:    good   2
 6:     bad   2
 7:     111   2
 8:     bad   3
 9: J.James   3
10:    good   4
11: J.James   4
12:     333   4
13: J.James   5
14:    good   5
15:     761   5
16:    Veni 666
17:    vidi 666
18:    Vici 666
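
The three steps described above can also be traced one at a time in base R. This is only a minimal sketch and not part of the original answer: the helper `locf()` and the intermediate variables `is_number`, `valid`, and `result` are hypothetical names introduced for illustration, and the loop-based fill stands in for `zoo::na.locf()`.

```r
# Sample data from the question; mixing strings and numbers yields a character vector
names <- c("bad", "good", 1, 2, "good", "bad", 111, "bad", "J.James",
           "good", "J.James", 333, "J.James", "good", 761, "Veni", "vidi", "Vici")
exempt <- c("Veni", "vidi", "Vici")

# Step 1: a row is "valid" if it is neither a number nor an exempt name
is_number <- !is.na(suppressWarnings(as.numeric(names)))
valid <- !is_number & !names %in% exempt

# Number only the valid rows in consecutive pairs: 1, 1, 2, 2, ...
grp <- rep(NA_integer_, length(names))
grp[valid] <- rep(seq_len(sum(valid)), each = 2L, length.out = sum(valid))

# Step 2: exempt names get the special group number
grp[names %in% exempt] <- 666L

# Step 3: carry the last observation forward into the remaining NA rows
# (hypothetical helper; a leading NA would stay NA)
locf <- function(x) {
  for (i in seq_along(x)[-1]) {
    if (is.na(x[i])) x[i] <- x[i - 1]
  }
  x
}

result <- data.frame(names, grp = locf(grp))
print(result)
```

Printing `grp` after each step shows how the NA gaps left by the numeric rows are closed only in the last step.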

`dplyr`/`tidyr`

Here is my attempt to provide a dplyr/tidyr solution:

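library(dplyr)
# tibble(names) keeps the column name `names`; as_tibble() on a vector would call it `value`
tibble(names) %>% 
  # valid rows: neither numeric nor exempt
  mutate(valid = is.na(suppressWarnings(as.numeric(names))) & !names %in% exempt,
         # count valid rows only, so that pairs are numbered 1, 1, 2, 2, ...
         grp = case_when(valid ~ (cumsum(valid) + 1L) %/% 2L,
                         names %in% exempt ~ 666L,
                         TRUE ~ NA_integer_)) %>% 
  select(-valid) %>% 
  tidyr::fill(grp)

# A tibble: 18 x 2
   names     grp
   <chr>   <int>
 1 bad         1
 2 good        1
 3 1           1
 4 2           1
 5 good        2
 6 bad         2
 7 111         2
 8 bad         3
 9 J.James     3
10 good        4
11 J.James     4
12 333         4
13 J.James     5
14 good        5
15 761         5
16 Veni      666
17 vidi      666
18 Vici      666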
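
One detail is easy to miss when the pair counter is built inside a single vectorised expression: a counter generated over all rows numbers the pairs by row position, whereas the data.table version generates it only over the subset of valid rows. The following base R sketch compares the two counters on the sample data. The names `by_position` and `by_valid` are illustrative assumptions, not part of the original answer.

```r
names <- c("bad", "good", 1, 2, "good", "bad", 111, "bad", "J.James",
           "good", "J.James", 333, "J.James", "good", 761, "Veni", "vidi", "Vici")
exempt <- c("Veni", "vidi", "Vici")

# Valid rows: neither numeric nor exempt
valid <- is.na(suppressWarnings(as.numeric(names))) & !names %in% exempt
n <- length(names)

# Counter generated over all rows, then subset: depends on the row position
by_position <- rep(seq_len(n), each = 2L, length.out = n)[valid]

# Counter generated over valid rows only: consecutive pairs 1, 1, 2, 2, ...
by_valid <- (cumsum(valid) + 1L)[valid] %/% 2L

print(rbind(by_position, by_valid))
```

Only `by_valid` yields the consecutive group numbers 1 to 5 of the expected output. `by_position` skips the number 2 and splits the pair in rows 8 and 9 across groups 4 and 5.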
