Count number of records and generate row number within each group
Problem description
I have the following data.table:
library(data.table)
set.seed(1)
DT <- data.table(VAL = sample(c(1, 2, 3), 10, replace = TRUE))
VAL
1: 1
2: 2
3: 2
4: 3
5: 1
6: 3
7: 3
8: 2
9: 2
10: 1
Within each value of VAL, I want to:
- Count the number of records/rows
- Create a row index (a counter for the first, second, third occurrence, etc.)
In the end, I want this result:
VAL COUNT IDX
1: 1 3 1
2: 2 4 1
3: 2 4 2
4: 3 3 1
5: 1 3 2
6: 3 3 2
7: 3 3 3
8: 2 4 3
9: 2 4 4
10: 1 3 3
where "COUNT" is the number of records/rows for each "VAL", and "IDX" is the row index within each "VAL".
I tried to work with which and length using .I:
DT[, list(COUNT = length(VAL == VAL[.I]),
          IDX = which(which(VAL == VAL[.I]) == .I))]
but this does not work, as .I refers to a vector with the index, so I guess one must use .I[]. Though inside .I[] I again face the problem that I do not have the row index, and I do know (from reading the data.table FAQ and following the posts here) that looping through rows should be avoided if possible.
So, what's the data.table way?
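To make the failure of the attempt above concrete, here is a minimal sketch of what .I evaluates to. The values of VAL are hard-coded to match the table shown earlier so that the result does not depend on the random number generator; the names res, grouped and ROW are only illustrative.

```r
library(data.table)

# Same values as the example table, hard-coded instead of sampled
DT <- data.table(VAL = c(1, 2, 2, 3, 1, 3, 3, 2, 2, 1))

# Without `by`, .I is just 1:nrow(DT), so VAL[.I] is VAL itself.
# The comparison is TRUE on every row: COUNT becomes the total row count
# and IDX becomes the plain row number.
res <- DT[, list(COUNT = length(VAL == VAL[.I]),
                 IDX = which(which(VAL == VAL[.I]) == .I))]

# With `by`, .I holds the original row numbers of the rows in each group
grouped <- DT[, .(ROW = .I), by = VAL]
```

In other words, the expression never looks at groups at all, which is why a grouping argument is needed rather than more indexing.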
Solution
Using .N...
DT[ , `:=`( COUNT = .N , IDX = 1:.N ) , by = VAL ]
# VAL COUNT IDX
# 1: 1 3 1
# 2: 2 4 1
# 3: 2 4 2
# 4: 3 3 1
# 5: 1 3 2
# 6: 3 3 2
# 7: 3 3 3
# 8: 2 4 3
# 9: 2 4 4
#10: 1 3 3
.N is the number of records in each group, with groups defined by "VAL", so 1:.N numbers the rows within each group.
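As a variation on the answer, the sketch below uses seq_len(.N) in place of 1:.N, and also shows rowid(), a data.table helper that produces the same within-group counter without a by clause. VAL is hard-coded to the values shown in the question so the check is reproducible; the column name IDX2 is only illustrative.

```r
library(data.table)

# Same values as the example table, hard-coded instead of sampled
DT <- data.table(VAL = c(1, 2, 2, 3, 1, 3, 3, 2, 2, 1))

# .N is the group size; seq_len(.N) counts 1, 2, ..., .N within the group
DT[, `:=`(COUNT = .N, IDX = seq_len(.N)), by = VAL]

# rowid() gives the running index per distinct value of VAL directly
DT[, IDX2 := rowid(VAL)]

print(DT)
```

seq_len(.N) and 1:.N agree whenever .N is at least 1. Preferring seq_len() is a defensive habit, since 1:0 would yield c(1, 0) rather than an empty vector.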