Reading the next k observations from the current observation


Problem description

Here's a very similar question

My question is a bit different from the one in the above link.

Background

I have a data set that contains hourly data, so each object has 24 records per day. For each object, I want to create K new columns that hold the values of the next 1, 2, ..., K hourly records. If a next record does not exist, the column should be set to missing.

K is dynamic and is defined by users.

The original order must be preserved, whether that is guaranteed within the data step or restored by sorting at the end.

I'm looking for an efficient way to achieve this.

Example

Original data:

Object Hour Value 
A      1    2.3
A      2    2.3
A      3    4.0
A      4    1.3

Given K = 2, desired output is

Object Hour Value Value1 Value2
A      1    2.3   2.3    4.0
A      2    2.3   4.0    1.3
A      3    4.0   1.3    .
A      4    1.3   .      .

Possible solutions

  1. Sort in reverse order -> obtain the previous k records -> sort back.

     When the number of observations is large, this is not an ideal approach.

  2. proc expand. I'm not familiar with it because it has never been licensed on my PC.

  3. Use point in a data step.

  4. Use a retain statement inside a data step. I'm not sure how this would work.

Solution

Assuming K is provided as a macro variable, this is pretty easily done with a side-by-side merge-ahead. Certainly faster than a transpose for K much larger than the total record count, and probably faster than looping over POINT= reads.

Basically you merge the original dataset to itself, and use FIRSTOBS to push the starting point down one for each successive merge iteration. This needs a bit of extra work if you have BY groups that need protecting, but that's usually not too hard to manage.
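
Written out by hand for K = 2 on the question's sample data, the merge-ahead idea looks roughly like the sketch below. The data set names have and want are assumptions made for illustration; the variable names come from the question.

```sas
/* Sample data from the question; the data set name "have" is illustrative */
data have;
  input Object $ Hour Value;
  datalines;
A 1 2.3
A 2 2.3
A 3 4.0
A 4 1.3
;
run;

/* MERGE without a BY statement pairs rows by position.
   FIRSTOBS=2 starts the second copy one row later, so it supplies the
   next row's Value; FIRSTOBS=3 supplies the Value two rows ahead.
   KEEP= is applied before RENAME=, so KEEP= uses the original name. */
data want;
  merge have
        have (firstobs=2 keep=Value rename=(Value=Value1))
        have (firstobs=3 keep=Value rename=(Value=Value2));
run;
```

Because a one-to-one merge sets the variables of an exhausted data set to missing, the last rows should end up with missing Value1 and Value2, as in the desired output. The original row order is untouched, since no sorting takes place.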

Here's an example using SASHELP.CLASS:

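%let K=5;

/* Generates one extra copy of &datain per lead. Copy &_i starts at observation
   &_i, so &varin._&_i holds the value (&_i - 1) rows ahead of the current row. */
%macro makemergesets(k=, datain=, varin=, keepin=);
  %local _i;
  %do _i = 2 %to &k;
    &datain (firstobs=&_i keep=&keepin. &varin. rename=(&varin.=&varin._&_i.))
  %end;
%mend makemergesets;

/* One-to-one merge (no BY statement): rows are paired by position */
data class_all;
  merge sashelp.class
    %makemergesets(k=&k, datain=sashelp.class, varin=age, keepin=)
  ;
run;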
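
The BY-group caveat mentioned above could be handled along the following lines. This is only a sketch, assuming the rows of each object are stored contiguously: each lead copy also carries its own Object value under a temporary name, and a lead value is blanked whenever it comes from a different object. The names have, want, _obj1, and _obj2 are hypothetical, and K = 2 is written out by hand; the same pattern could be generated by a %do loop like the one in the macro.

```sas
/* Two objects stored contiguously; names and values are illustrative */
data have;
  input Object $ Hour Value;
  datalines;
A 1 2.3
A 2 2.3
A 3 4.0
B 1 1.3
B 2 5.1
;
run;

data want;
  merge have
        have (firstobs=2 keep=Object Value rename=(Object=_obj1 Value=Value1))
        have (firstobs=3 keep=Object Value rename=(Object=_obj2 Value=Value2));
  /* A lead row that belongs to a different object is not a valid next record */
  if _obj1 ne Object then call missing(Value1);
  if _obj2 ne Object then call missing(Value2);
  drop _obj1 _obj2;
run;
```

With this check, the last row of object A gets missing lead values instead of picking up values from object B.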
