SAS: Using the lag function without a set statement (to simulate time series data)
Question
Could someone explain why the following two pieces of code give different results? I would like to simulate some simple time series processes in SAS, but I'm struggling with the lag function.
Specifically, in program 1, the variable b contains no data (every value is missing), which is unexpected. In program 2, the lag function works as expected.
/*Program 1*/
data lagtest;
a = 1;
b=lag(a);
output;
a = 2;
b= lag(a);
output;
a = 3;
b= lag(a);
output;
run;
/*Program 2*/
data lagtest2;
input a;
datalines;
1
2
3
;
run;
data lagtest2;
set lagtest2;
b= lag(a);
run;
I've been reading about the lag function, but I can't find any reference to its use in a data step that does not read an input dataset.
Thanks very much for your help.
Recommended answer
Keith's roughly correct in that the approach he shows is the right one, but the reasoning isn't accurate. LAG works on data; input and output are irrelevant (and not really a meaningful distinction). It is, in fact, quite possible to make this work with only programmatically provided data.
data lagtest;
do a=1 to 3;
b=lag(a);
output;
end;
run;
Similarly, it's possible to make the second example fail, with a somewhat absurd example:
data lagtest2;
p=1;
set lagtest2 point=p;
b= lag(a);
output;
p=2;
set lagtest2 point=p;
b=lag(a);
output;
p=3;
set lagtest2 point=p;
b=lag(a);
output;
stop;
run;
The reason the first example doesn't work isn't the source of the data; it's the number of lag calls. One of the most common mistakes is to believe that lag retrieves a value from the previous record; that isn't true. The way lag works is that each call to lag in the code creates its own queue. Each time that lag statement is executed, whatever value is in the argument is pushed onto the queue, and if the queue is then at least the defined length + 1 long, the value at the front of the queue is popped off and returned; otherwise lag returns a missing value. (For lag or lag1, the queue must be 2 long; for lag2, it must be 3 long; and so on, i.e., the number of the function plus the value just pushed on.)
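To make the queue behaviour concrete, here is a minimal sketch (the dataset name lagdemo is made up for illustration) in which a single lag call is executed only on every other loop iteration. If lag really read the previous record, b would hold a - 1 on the even rows; instead it holds the value that was pushed the last time the call ran.

```sas
/* Illustrative sketch: one LAG call site, executed only when a is even. */
data lagdemo;
  do a = 1 to 6;
    if mod(a, 2) = 0 then b = lag(a); /* pushes a; returns the value pushed by the previous execution */
    else b = .;                       /* LAG is not executed here, so the queue is untouched */
    output;
  end;
run;

/* Resulting rows (a, b): (1, .) (2, .) (3, .) (4, 2) (5, .) (6, 4) */
proc print data=lagdemo;
run;
```

On the row where a = 4, b is 2 rather than 3, because 2 was the last value pushed onto that call's queue.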
In your first example, you call lag three times, so three separate queues are created, and none of them is ever called a second time. In your second example, you call lag once, so one queue is created, and it is called three times.
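If you want to keep the statement-by-statement layout of program 1, one possible workaround is to route every assignment through a single lag call. The sketch below does this with LINK and RETURN; the label name addrow is arbitrary and not part of the original question.

```sas
/* Sketch: program 1 restructured so that only one LAG call site exists. */
data lagtest;
  a = 1; link addrow;
  a = 2; link addrow;
  a = 3; link addrow;
  return;            /* end the iteration here so execution does not fall through into addrow */
addrow:
  b = lag(a);        /* the single LAG call, hence a single queue, executed three times */
  output;
  return;            /* jump back to the statement following the LINK */
run;

/* Resulting rows (a, b): (1, .) (2, 1) (3, 2) */
```

This is only one way to arrange it; the do loop shown earlier is simpler whenever the values can be generated in a loop.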