Dynamically subsetting a data table

Question

I have a question about dynamically subsetting a data table. I know there are numerous threads on Stack Overflow with similar titles, but unfortunately none of them led me to the solution I need.

Example data set:

library(data.table)
# Two periods (date = 1, 2), five IDs per period, and a running value in var
dt <- data.table(date = c(rep(1, 5), rep(2, 5)), id = rep(1:5, 2), var = 1:10)

For each ID, I would like to find the subset of all other IDs in all earlier periods. The example data set has 5 IDs and two periods. For ID = 5 in period 2, the corresponding subset would be the rows with ID in {1, 2, 3, 4} and date = 1. In this simple data set I can of course code this by hand:

dt[, dt[-.I][date < 2], by = id]
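
To make the target concrete, the subset described above for ID = 5 in period 2 can be written out directly. This is only a sketch of the desired result for one case, with the ID and the period hard-coded; the name `target` is made up for illustration.

```r
library(data.table)

dt <- data.table(date = c(rep(1, 5), rep(2, 5)), id = rep(1:5, 2), var = 1:10)

# Desired subset for ID = 5 in period 2:
# every other ID, restricted to periods strictly before 2
target <- dt[id != 5 & date < 2]
print(target)
```

On the example data this keeps the four rows with date = 1 and id 1 to 4. The goal of the question is to obtain this kind of subset for every ID and period without hard-coding the values.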

I would like to do this automatically, however. I tried something like

dt[, dt[-.I][date < unique(dt$date[.I])], by = id]

but that does not work.
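
One likely reason the attempt misbehaves: with `by = id`, `.I` holds the row numbers of all rows belonging to that ID, so the dates looked up through `.I` span every period of the ID rather than a single one. The sketch below only inspects that lookup on the example data; the name `dates_per_id` is an assumption made for illustration.

```r
library(data.table)

dt <- data.table(date = c(rep(1, 5), rep(2, 5)), id = rep(1:5, 2), var = 1:10)

# Grouping by id alone puts both periods of an ID into one group,
# so unique(dt$date[.I]) has two elements instead of one cutoff date
dates_per_id <- dt[, .(dates = paste(unique(dt$date[.I]), collapse = ",")), by = id]
print(dates_per_id)
```

Because the right-hand side of `date < ...` then has length two, R recycles it across the rows instead of comparing against a single period. Grouping by both `id` and `date` gives each group exactly one row and therefore one cutoff date.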

Any helpful comments are appreciated! Thanks!

Answer

I think this is the faster solution:

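# For each (id, date) row, keep the rows of dta with a different id and an earlier date
dta <- data.table(date = c(rep(1, 5), rep(2, 5)), id = rep(1:5, 2), var = 1:10)
dta[, dta[dta[.I]$id != dta$id & dta[.I]$date > dta$date], by = list(id, date)]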

Any comments on how to make this code even faster are highly appreciated.
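
As one possible direction, the same pairing can be sketched without a grouped call by enumerating all row pairs once and filtering them in a single vectorised step. This is not the answer's method, and whether it is actually faster is untested here; it also builds all n² index pairs, so it is only reasonable for small tables. The names `pairs`, `focal`, `other`, `keep`, `res`, and the `other_*` columns are hypothetical.

```r
library(data.table)

dta <- data.table(date = c(rep(1, 5), rep(2, 5)), id = rep(1:5, 2), var = 1:10)

n <- nrow(dta)

# Every (focal row, other row) combination of row indices
pairs <- CJ(focal = seq_len(n), other = seq_len(n))

# Keep a pair when the other row has a different id and a strictly earlier date
keep <- dta$id[pairs$focal] != dta$id[pairs$other] &
  dta$date[pairs$focal] > dta$date[pairs$other]
pairs <- pairs[which(keep)]

# Assemble the result: the focal id/date next to the matching earlier rows
res <- data.table(
  id         = dta$id[pairs$focal],
  date       = dta$date[pairs$focal],
  other_id   = dta$id[pairs$other],
  other_date = dta$date[pairs$other],
  other_var  = dta$var[pairs$other]
)
print(res)
```

On the example data, each of the five rows in period 2 is paired with the four other IDs from period 1, and rows in period 1 have no earlier period to match.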
