Filling up index column in data.table


Question

This question is related to: Add a column to a data frame that index the number of occurrences in a group.

I have the following data.table, sorted by its first two columns.

library(data.table)

# Equivalent construction of the dput() output; the .internal.selfref pointer cannot be pasted back into R
ddt = data.table(Unit = factor(c("A", "A", "A1", "A1", "B", "B")),
                 Anything = c(3.4, 6.9, 1.1, 2.2, 2, 3),
                 index = c(0, 0, 0, 0, 0, 0))
setkey(ddt, Unit, Anything)  # restores the "sorted" attribute

ddt
   Unit Anything index
1:    A      3.4     0
2:    A      6.9     0
3:   A1      1.1     0
4:   A1      2.2     0
5:    B      2.0     0
6:    B      3.0     0

The index column should be filled with 1, 2, 3, ... within each Unit. For a data.frame I can do it with a loop:

for(U in unique(ddt$Unit)){
    ddt[ddt$Unit==U,]$index = 1:length(ddt[ddt$Unit==U,]$Unit)
}

ddt
  Unit Anything index
1    A      3.4     1
3    A      6.9     2
4   A1      1.1     1
2   A1      2.2     2
5    B      2.0     1
6    B      3.0     2

But how can I do this using data.table commands? Thanks for your help.

Answer

Try

 ddt[, indx:=1:.N, by=Unit]
 #     Unit Anything indx
 #1:    A      3.4    1
 #2:    A      6.9    2
 #3:   A1      1.1    1
 #4:   A1      2.2    2
 #5:    B      2.0    1
 #6:    B      3.0    2
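
Why this works: with by = Unit, the j expression is evaluated once per group, and .N is the number of rows in the current group, so 1:.N counts 1, 2, ... within each Unit, while := assigns the result by reference. The sketch below rebuilds the example table (values copied from the question) and shows two equivalent spellings; the column names index and index2 are chosen here for illustration, and rowid() is assumed to be available, which requires a reasonably recent data.table release.

```r
library(data.table)

# Rebuild the example table from the question.
ddt <- data.table(
  Unit = factor(c("A", "A", "A1", "A1", "B", "B")),
  Anything = c(3.4, 6.9, 1.1, 2.2, 2, 3)
)

# .N is the row count of the current group, so seq_len(.N)
# yields 1..N within each Unit; := adds the column by reference.
ddt[, index := seq_len(.N), by = Unit]

# rowid() computes the same within-group counter without a by clause.
ddt[, index2 := rowid(Unit)]

print(ddt)
```

seq_len(.N) behaves like 1:.N here, since every group has at least one row; it is simply the more defensive habit in R code where a length could be zero.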
