Retain and lag function in R as SAS

Question

I am looking for functions in R similar to the lag1, lag2 and retain functions in SAS that I can use with data.table objects.

I know there are functions like embed and lag in R, but they don't return a single value or the previous value. They return a complete set of vectors.

Is there anything in R which I can use with data.table?

More info on the SAS functions:

  • Retain
  • Lag

Recommended answer

You have to be aware that R works very differently from the data step in SAS. The lag function in SAS is used in the data step, within the implicit loop structure of that data step. The same goes for the retain statement, which simply keeps a value constant from one iteration of that loop to the next.

R, on the other hand, works in a completely vectorized way. This means that you have to rethink what you want to do, and adapt accordingly.
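To make the contrast concrete, here is a minimal sketch (not part of the original answer) that computes a one-step lag twice: once with an explicit loop that imitates row-by-row processing, and once with a single vectorized expression. The variable names are illustrative only.

```r
x <- c(1, 2, 3, 4, 5, 6)

# Loop style: visit one element at a time and carry the previous value along,
# roughly how one thinks about a row-by-row DATA step.
y_loop <- rep(NA_real_, length(x))
prev <- NA_real_
for (i in seq_along(x)) {
  y_loop[i] <- prev   # value seen on the previous iteration
  prev <- x[i]        # remember the current value for the next iteration
}

# Vectorized style: build the whole shifted vector in one expression.
y_vec <- c(NA, x[-length(x)])

identical(y_loop, y_vec)
```

Both approaches produce the same vector; the vectorized form is the one that fits naturally into a data.frame or data.table column assignment.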


  • For holding a constant value, retain is unnecessary in R, because R recycles arguments by default. If you want to do this explicitly, you can use e.g. rep() to construct a vector with a constant value and a given length.
  • lag is a matter of using indices: you just shift the position of all values in a vector. To keep a vector of the same length, you need to add some NA values at the start and drop the same number of values from the end.
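The two points above can be sketched in base R as follows. This is only an illustration; the object names (df, year_explicit, lag1_by_index) are made up for the example.

```r
x <- 1:6

# Recycling: the single value 2013 is repeated for every row automatically.
df <- data.frame(x = x, year = 2013)

# The explicit equivalent builds the constant vector with rep().
year_explicit <- rep(2013, length(x))

# Lagging by index: requesting index NA yields NA, so prepending NA to the
# index vector shifts every value down by one position.
lag1_by_index <- x[c(NA, seq_len(length(x) - 1))]
```

Here df$year and year_explicit hold the same six values, and lag1_by_index has the same length as x with an NA in the first position.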

A simple example: this SAS code lags a variable x and adds a variable year that has a constant value:

data one;
   retain year 2013;
   input x @@;
   y=lag1(x);
   z=lag2(x);
   datalines;
1 2 3 4 5 6
;

In R, you could write your own lag function like this:

# Lag x by k positions (assumes 1 <= k <= length(x))
mylag <- function(x, k) c(rep(NA, k), head(x, -k))

This single line prepends k NA values to the vector and drops its last k values. The result is a lagged vector like the one given by lag1 etc. in SAS.
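One caveat, offered as an observation rather than something from the original answer: head(x, -k) returns an empty vector when k is 0, so mylag(x, 0) does not give back x. A hedged variant, under the hypothetical name mylag_safe, pads first and then truncates to the original length:

```r
# Hypothetical variant of mylag: pad with k NAs, then keep only the
# first length(x) values, so the result always has the length of x.
mylag_safe <- function(x, k) {
  stopifnot(k >= 0)
  c(rep(NA, k), x)[seq_along(x)]
}

mylag_safe(1:6, 0)  # x unchanged
mylag_safe(1:6, 2)  # NA NA 1 2 3 4
mylag_safe(1:3, 5)  # all NA, still length 3
```

For 1 <= k <= length(x) this should agree with the one-liner above; it only differs in the edge cases.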

This allows something like:

nrs <- 1:6 # equivalent to datalines
one <- data.frame(
   x = nrs,
   y = mylag(nrs,1),
   z = mylag(nrs,2),
   year = 2013  # R recycles the single value to every row, so no extra command is needed
)

The result is:

> one
  x  y  z year
1 1 NA NA 2013
2 2  1 NA 2013
3 3  2  1 2013
4 4  3  2 2013
5 5  4  3 2013
6 6  5  4 2013

Exactly the same works with a data.table object. The important point here is to rethink your strategy: instead of thinking loopwise as you do with the DATA step in SAS, you have to start thinking in terms of vectors and indices when using R.
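As a sketch of the data.table case, assuming the data.table package is installed: the mylag helper can be used inside `:=`, and data.table also ships its own shift() function, which lags by default. The column names y2 and z2 are illustrative and not from the original answer.

```r
library(data.table)

# Repeated here so the block runs on its own.
mylag <- function(x, k) c(rep(NA, k), head(x, -k))

one <- data.table(x = 1:6)

# Add the lagged columns and the constant column by reference.
one[, y := mylag(x, 1)]
one[, z := mylag(x, 2)]
one[, year := 2013]   # the single value is recycled to all rows

# data.table's built-in alternative: shift() lags by default (type = "lag").
one[, y2 := shift(x, n = 1)]
one[, z2 := shift(x, n = 2)]

print(one)
```

In this example y2 and z2 should match y and z. shift() can also be combined with `by =` when the lag has to restart within each group.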
