data.table "key indices" or "group counter"
Problem description
After creating a key on a data.table:
library(data.table)
set.seed(12345)
DT <- data.table(x = sample(LETTERS[1:3], 10, replace = TRUE),
                 y = sample(LETTERS[1:3], 10, replace = TRUE))
# Sort the table by x, then y, and mark those columns as the key
setkey(DT, x, y)
DT
#       x y
#  [1,] A B
#  [2,] A B
#  [3,] B B
#  [4,] B B
#  [5,] C A
#  [6,] C A
#  [7,] C A
#  [8,] C A
#  [9,] C C
# [10,] C C
I would like to get an integer vector giving, for each row, the corresponding "key index". I hope the expected output (column i) below clarifies what I mean:
#       x y i
#  [1,] A B 1
#  [2,] A B 1
#  [3,] B B 2
#  [4,] B B 2
#  [5,] C A 3
#  [6,] C A 3
#  [7,] C A 3
#  [8,] C A 3
#  [9,] C C 4
# [10,] C C 4
I thought about using something like cumsum(!duplicated(DT[, key(DT), with = FALSE])), but I am hoping there is a better solution. I feel this vector could be part of the table's internal representation, and maybe there is a way to access it. Even if that is not the case, what would you suggest?
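For reference, the cumsum/duplicated idea mentioned above can be sketched as follows. The small table here is hand-written (hypothetical data chosen so the result does not depend on a random seed), and the approach assumes the rows are already sorted by the key, which setkey() guarantees.

library(data.table)

# Hypothetical example data, not the sampled table from the question
DT <- data.table(x = c("A", "A", "B", "B", "C", "C", "C", "C", "C", "C"),
                 y = c("B", "B", "B", "B", "A", "A", "A", "A", "C", "C"))
setkey(DT, x, y)

# duplicated() flags rows whose key combination has already appeared;
# negating it marks the first row of each group, and cumsum() turns
# those marks into a running group counter.
i <- cumsum(!duplicated(DT[, key(DT), with = FALSE]))
i
# [1] 1 1 2 2 3 3 3 3 4 4

Because the counter only increases when a new key combination first appears, it matches the desired "key index" only while equal keys sit in adjacent rows.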
Recommended answer
Update: From v1.8.3, you can simply use the inbuilt special .GRP:
DT[ , i := .GRP, by = key(DT)]
See history for older answers.
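As a minimal sketch of the .GRP approach on a small hand-written table (the data below is hypothetical, chosen so the result does not depend on a random seed):

library(data.table)

# Hypothetical example data, not the sampled table from the question
DT <- data.table(x = c("A", "A", "B", "B", "C", "C", "C", "C", "C", "C"),
                 y = c("B", "B", "B", "B", "A", "A", "A", "A", "C", "C"))
setkey(DT, x, y)

# .GRP holds the counter of the current group inside a grouped j expression;
# := adds it as column i by reference, so no copy of DT is made.
DT[, i := .GRP, by = key(DT)]
DT$i
# [1] 1 1 2 2 3 3 3 3 4 4

Note that .GRP numbers groups in order of first appearance, so the counter follows the key order here only because setkey() has already sorted the rows.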