按组依次查找下一个日期 [英] Find next date in series by group

查看:44
本文介绍了按组依次查找下一个日期的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一些这样的数据:

sample.data <- rbind(data.table(start.date=seq(from=as.Date("2010-01-01"), to=as.Date("2014-12-01"), by="quarter"),
                 Group=c("A","B","C","D"), rnorm(20, 5)),
                 data.table(start.date=seq(from=as.Date("2010-01-01"), to=as.Date("2014-12-01"), by="quarter"),
                 Group=c("A","B","C","D"), rnorm(20, 3))
                 )

我想创建一个 end.date 列,该列等于每个组的下一个最早的 start.date 值.

I would like to create an end.date column that equals the next earliest start.date value for each group.

例如, Group == A 的第一个 start.date 2010-01-01 . Group == A 的下一个最早的 start.date 2011-01-01 .因此,当按 Group :

So, for example, the first start.date for Group==A is 2010-01-01. The next earliest start.date for Group==A is 2011-01-01. So the final result should look like this when sorted by Group:

                start.date Group   end.date
                2010-01-01     A 2011-01-01
                2010-01-01     A 2011-01-01
                2011-01-01     A 2012-01-01
                2011-01-01     A 2012-01-01
                2012-01-01     A 2013-01-01
                2012-01-01     A 2013-01-01
                2013-01-01     A 2014-01-01
                2013-01-01     A 2014-01-01
                2014-01-01     A         NA
                2014-01-01     A         NA
                2010-04-01     B 2011-04-01
                2010-04-01     B 2011-04-01
                2011-04-01     B 2012-04-01
                2011-04-01     B 2012-04-01

以此类推.理想情况下,我想参考

And so on. Ideally I would like to do this by reference, like

sample.data [,end.date:= EXPRESSION]

但是对于从哪里开始感到迷茫.感谢您的帮助.

but am at a loss for where to start. Thanks for any help.

推荐答案

好吧:

events = unique(sample.data[ , .(Group, start.date) ])
events[, next.date := shift(start.date, type="lead"), by=Group]

sample.data[events, on=c("Group", "start.date"), end.date := next.date ]

我认为,OP应该始终具有类似于 events 的表格,以符合数据库设计/

In my opinion, the OP should have a table like events anyways, in keeping with database design / tidy data. The result looks like

> sample.data[ order(Group, start.date) ]

    start.date Group   end.date
 1: 2010-01-01     A 2011-01-01
 2: 2010-01-01     A 2011-01-01
 3: 2011-01-01     A 2012-01-01
 4: 2011-01-01     A 2012-01-01
 5: 2012-01-01     A 2013-01-01
 6: 2012-01-01     A 2013-01-01
 7: 2013-01-01     A 2014-01-01
 8: 2013-01-01     A 2014-01-01
 9: 2014-01-01     A       <NA>
10: 2014-01-01     A       <NA>
11: 2010-04-01     B 2011-04-01
12: 2010-04-01     B 2011-04-01
...

这篇关于按组依次查找下一个日期的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆