Add a counter column by arranging on two variables (dplyr)


Question

I've been searching for a while but couldn't find a solution for my situation. I have a data frame in which the ID and VAR values are mixed together. Below I've tried to reproduce a sample:

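library(dplyr)
set.seed(123)
N <- 3  # number of distinct IDs
T <- 4  # rows per ID (note: this masks the T shorthand for TRUE)
id <- rep(letters[1:N], each = T)
var <- rep(sample(1:100, T), N)
row <- sample(1:(N * T), replace = FALSE)  # random row order

# Shuffle the rows so that ID and VAR end up mixed
dt <- data.frame(ID = id, VAR = var, ROW = row) %>%
  arrange(ROW) %>%
  select(-ROW)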

I'd like to arrange by ID and VAR and add a counter within each ID group, to get something like:

   ID VAR COUNTER
1   a   1       1
2   a  11       2
3   a  22       3
4   a  64       4
5   b   1       1
6   b  11       2
7   b  22       3
8   b  64       4
9   c   1       1
10  c  11       2
11  c  22       3
12  c  64       4

All of this, if possible, using only dplyr or base functions.

Recommended answer

Within dplyr you need to arrange() by ID and VAR, and then group_by() just ID.

Then you use mutate() to add a new column that counts from 1 to n() (where n() is a dplyr function that returns the number of rows in the current group).

set.seed(123)
dt %>%
    arrange(ID, VAR) %>%
    group_by(ID) %>%
    mutate(COUNTER = 1:n()) %>%  # row_number() gives the same result
    ungroup()

# # A tibble: 12 × 3
#         ID   VAR COUNTER
#     <fctr> <int>   <int>
# 1       a    29       1
# 2       a    41       2
# 3       a    79       3
# 4       a    86       4
# 5       b    29       1
# 6       b    41       2
# 7       b    79       3
# 8       b    86       4
# 9       c    29       1
# 10      c    41       2
# 11      c    79       3
# 12      c    86       4
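
Since the question also allows base functions, here is a minimal base R sketch of the same idea. It assumes a small hand-written sample (the values below are hypothetical, not the question's data) and uses order() for the sorting step and ave() with seq_along() for the per-ID counter. It is one possible approach, not the only one.

```r
# Hypothetical shuffled sample with the same column names as the question
dt <- data.frame(
  ID  = c("b", "a", "c", "a", "b", "c"),
  VAR = c(41L, 86L, 29L, 29L, 79L, 41L)
)

# Sort by ID, then by VAR within each ID
dt_sorted <- dt[order(dt$ID, dt$VAR), ]
rownames(dt_sorted) <- NULL  # reset row names after reordering

# ave() applies seq_along() separately within each ID group,
# so the counter restarts at 1 for every ID
dt_sorted$COUNTER <- ave(seq_along(dt_sorted$ID), dt_sorted$ID, FUN = seq_along)

print(dt_sorted)
```

The sort has to happen before the counter is computed; otherwise the counter would follow the original row order within each ID rather than the order of VAR.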


A comment on ungrouping

I call ungroup() to remove all the 'grouping' attributes associated with a grouped_df. In this example the result is the same, but leftover grouping attributes may bite you further down the line.

dt_grouped <- dt %>%
    arrange(ID, VAR) %>%
    group_by(ID) %>%
    mutate(COUNTER = 1:n()) 

dt_ungrouped <- dt %>%
    arrange(ID, VAR) %>%
    group_by(ID) %>%
    mutate(COUNTER = 1:n()) %>%
    ungroup()

str(dt_grouped)
# Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame':   12 obs. of  3 variables:
#  $ ID     : Factor w/ 3 levels "a","b","c": 1 1 1 1 2 2 2 2 3 3 ...
#  $ VAR    : int  29 41 79 86 29 41 79 86 29 41 ...
#  $ COUNTER: int  1 2 3 4 1 2 3 4 1 2 ...
#  - attr(*, "vars")=List of 1
#   ..$ : symbol ID
#  - attr(*, "labels")='data.frame': 3 obs. of  1 variable:
#   ..$ ID: Factor w/ 3 levels "a","b","c": 1 2 3
#   ..- attr(*, "vars")=List of 1
#   .. ..$ : symbol ID
#   ..- attr(*, "drop")= logi TRUE
#  - attr(*, "indices")=List of 3
#   ..$ : int  0 1 2 3
#   ..$ : int  4 5 6 7
#   ..$ : int  8 9 10 11
#  - attr(*, "drop")= logi TRUE
#  - attr(*, "group_sizes")= int  4 4 4
#  - attr(*, "biggest_group_size")= int 4

str(dt_ungrouped)
# Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 12 obs. of  3 variables:
#  $ ID     : Factor w/ 3 levels "a","b","c": 1 1 1 1 2 2 2 2 3 3 ...
#  $ VAR    : int  29 41 79 86 29 41 79 86 29 41 ...
#  $ COUNTER: int  1 2 3 4 1 2 3 4 1 2 ...
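
To illustrate how leftover grouping can change a later result, here is a small sketch on a hypothetical four-row sample (the data and the names by_id, overall and MEAN_VAR are made up for illustration). It uses row_number() for the counter, which should give the same values as 1:n() here, and then runs the same summarise() call on the grouped and the ungrouped version.

```r
library(dplyr)

# Hypothetical sample, small enough to check by hand
dt <- data.frame(
  ID  = c("a", "a", "b", "b"),
  VAR = c(10L, 20L, 30L, 40L)
)

dt_grouped <- dt %>%
  arrange(ID, VAR) %>%
  group_by(ID) %>%
  mutate(COUNTER = row_number())  # counter restarts within each ID

dt_ungrouped <- ungroup(dt_grouped)

# Still grouped by ID: summarise() returns one row per ID
by_id <- summarise(dt_grouped, MEAN_VAR = mean(VAR))

# Grouping removed: summarise() returns a single overall row
overall <- summarise(dt_ungrouped, MEAN_VAR = mean(VAR))

print(by_id)
print(overall)
```

The same call yields per-ID means on the grouped data and one overall mean on the ungrouped data, which is the kind of silent difference that ungroup() guards against.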
