Lag in dataframe

Question

I have a data frame like

  ID_CASE   Month   
CS00000026A 201301  
CS00000026A 201302  
CS00000026A 201303  
CS00000026A 201304  
CS00000026A 201305  
CS00000026A 201306  
CS00000026A 201307  
CS00000026A 201308  
CS00000026A 201309  
CS00000026A 201310  
CS00000191C 201302  
CS00000191C 201303  
CS00000191C 201304  
CS00000191C 201305  
CS00000191C 201306  
CS00000191C 201307  
CS00000191C 201308  
CS00000191C 201309  
CS00000191C 201310  

I want the final data frame to have three additional columns, like

  ID_CASE   Month   Lag_1   Lag_2   Lag_3
CS00000026A 201301  NA      NA      NA
CS00000026A 201302  201301  NA      NA
CS00000026A 201303  201302  201301  NA
CS00000026A 201304  201303  201302  201301
CS00000026A 201305  201304  201303  201302
CS00000026A 201306  201305  201304  201303
CS00000026A 201307  201306  201305  201304
CS00000026A 201308  201307  201306  201305
CS00000026A 201309  201308  201307  201306
CS00000026A 201310  201309  201308  201307
CS00000191C 201302  NA      NA      NA
CS00000191C 201303  201302  NA      NA
CS00000191C 201304  201303  201302  NA
CS00000191C 201305  201304  201303  201302
CS00000191C 201306  201305  201304  201303
CS00000191C 201307  201306  201305  201304
CS00000191C 201308  201307  201306  201305
CS00000191C 201309  201308  201307  201306
CS00000191C 201310  201309  201308  201307

where

  • Lag_1 is Month lagged by 1 month

  • Lag_2 is Month lagged by 2 months

  • Lag_3 is Month lagged by 3 months.

I have used the following code to at least get Lag_1:

df <- ddply(df,.(ID_CASE),transform,
                  Lag_1 <- c(NA,Month[-nrow(df)])) 

But this does not give me the desired output for Lag_1.

I have also tried the approach described in "Lag in R dataframe".

And how can this be done if I have a date object instead of an int column 'Month' as in the current example?

Any help on this will be appreciated.

Recommended answer

From the next version of data.table (v1.9.6+) you can use shift():

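library(data.table)
# setDT() converts df to a data.table by reference;
# shift(Month, 1:3) returns the three lagged vectors, computed within each ID_CASE
setDT(df)[, paste("Lag", 1:3, sep="_") := shift(Month, 1:3), by=ID_CASE]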
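
To make the shift() call easier to try, here is a minimal self-contained sketch. The data is a made-up subset in the same shape as the question's data frame, and the names lag_cols and dt are chosen only for illustration; the lag columns are named Lag_1 to Lag_3 on the assumption that they should match the desired output in the question.

```r
library(data.table)

# Made-up subset shaped like the question's data (4 rows for one case, 3 for the other)
df <- data.frame(
  ID_CASE = rep(c("CS00000026A", "CS00000191C"), times = c(4, 3)),
  Month   = c(201301:201304, 201302:201304)
)

# Illustrative names for the new columns (an assumption, chosen to match the question)
lag_cols <- paste("Lag", 1:3, sep = "_")

# as.data.table() makes a copy, so df itself is left untouched
dt <- as.data.table(df)

# shift() with n = 1:3 returns a list of three lagged vectors;
# by = ID_CASE restarts the lag at the first row of each case
dt[, (lag_cols) := shift(Month, 1:3), by = ID_CASE]

print(dt)
```

Note that shift() lags by row position within each group, not by calendar arithmetic. The result equals "lagged by N months" only if each ID_CASE has exactly one row per consecutive month and the rows are sorted by Month, as they are in the question's example.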

You can already try it out from the current dev version (v1.9.5) on the GitHub project page.
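
The question also asks about a Month column stored as a date rather than an int. The answer above does not cover that case, so the following is only a sketch under assumptions: the data is hypothetical, and it relies on shift() keeping the Date class of its input, which is worth confirming with class() on your installed data.table version.

```r
library(data.table)

# Hypothetical data: same idea as the question, but Month is a Date
dt <- data.table(
  ID_CASE = rep(c("A", "B"), each = 3),
  Month   = as.Date(c("2013-01-01", "2013-02-01", "2013-03-01",
                      "2013-02-01", "2013-03-01", "2013-04-01"))
)

# Sort first: shift() lags by row position, so row order defines "previous"
setorder(dt, ID_CASE, Month)

# Same call as for the integer column, applied to the Date column
dt[, Lag_1 := shift(Month, 1), by = ID_CASE]

print(dt)
class(dt$Lag_1)  # check that the lagged column is still a Date
```

As with the integer version, this picks the previous row within each ID_CASE. If some months are missing for a case, the previous row is not the previous calendar month, and a join on a computed month would be needed instead.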
