SAS: Using the lag function without a set statement (to simulate time series data)

Question

Could someone explain why the following two pieces of code give different results? I would like to simulate some simple time series processes in SAS, but I'm struggling with the lag function.

Specifically, in program 1, the variable b contains no data, which is unexpected. In program 2, the lag function works as expected.

/*Program 1*/
data lagtest;
a = 1;
b=lag(a);
output;

a = 2;
b= lag(a);
output;

a = 3;
b= lag(a);
output;
run;


/*Program 2*/
data lagtest2;
input a;
datalines;
1
2
3
;
run;

data lagtest2;
set lagtest2;
b= lag(a);
run;

I've been reading about the lag function, but can't find any references to its use in a DATA step that does not read an input dataset.

Many thanks for your help.

Recommended answer

Keith is roughly correct, in that the approach he shows is the right one, but the reasoning isn't accurate. LAG works on data; whether that data is input or output is irrelevant (and not really a meaningful distinction). It is, in fact, quite possible to make this work with only programmatically provided data.

data lagtest;
do a=1 to 3;
  b=lag(a);
  output;
end;
run;

Similarly, it's possible to make the second example not work, with a somewhat absurd example:

data lagtest2;
 p=1;
 set lagtest2 point=p;
 b= lag(a);
 output;
 p=2;
 set lagtest2 point=p;
 b=lag(a);
 output;
 p=3;
 set lagtest2 point=p;
 b=lag(a);
 output;
 stop;  
run;

The reason the first example doesn't work isn't the source of the data; it's the number of lag calls. One of the most common mistakes is to believe that lag retrieves a value from the previous record; that isn't true. Each call to lag that appears in the code creates its own queue. Each time that particular call is executed, the current value of its argument is pushed onto the queue, and if the queue is then at least the defined length + 1 long, the value at the front of the queue is popped off and returned; otherwise the result is missing. (For lag or lag1, the queue must be 2 long; for lag2 it must be 3 long; and so on, i.e. the number of the function plus the value just pushed on.)
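
A minimal sketch of the queue behaviour described above. The data set name lagdemo and the variables b2 and even_lag are made up for illustration. The lag2 call runs on every iteration, while the lag call runs only when a is even, so its queue only ever sees the even values.

```sas
/* Illustration only: lagdemo, b2 and even_lag are hypothetical names. */
data lagdemo;
  do a = 1 to 6;
    /* Executed on every iteration: returns the value pushed two calls ago. */
    b2 = lag2(a);

    /* Executed only for even a: this queue never receives the odd values. */
    if mod(a, 2) = 0 then even_lag = lag(a);
    else even_lag = .;

    output;
  end;
run;

/* Resulting rows:
   a  b2  even_lag
   1   .     .
   2   .     .
   3   1     .
   4   2     2
   5   3     .
   6   4     4
*/
```

At a=4 the value of even_lag is 2, not 3: lag returns what was pushed the last time that call ran, not the value from the previous record.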

In your first example, you call lag three times, so three separate queues are created, and none of them is ever called a second time. In your second example, you call lag once, so one queue is created, and it is called three times.
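
One way to check this explanation, offered as a sketch rather than a recommended style, is to keep the three hand-written assignments from program 1 but route them all through a single lag call with LINK. The data set name lagtest_link and the label addrow are made up for illustration.

```sas
/* Illustration only: lagtest_link and addrow are hypothetical names. */
data lagtest_link;
  a = 1; link addrow;
  a = 2; link addrow;
  a = 3; link addrow;
  return;            /* end the step here instead of falling into the label */

addrow:
  b = lag(a);        /* the only lag call in the step, so only one queue */
  output;
  return;            /* jump back to the statement after the LINK */
run;

/* Resulting rows:
   a  b
   1  .
   2  1
   3  2
*/
```

Because the same lag call is executed three times, its queue fills up and b is populated from the second row onward, which is the same effect the do loop achieves.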
