选择在数据步上具有最大值的观测值 [英] Choosing observations, which have the maximum value on the data step

查看:19
本文介绍了选择在数据步上具有最大值的观测值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

:

date
05DEC2016   a
28NOV2016   b
22DEC2016   c
23JAN2017   d
12JAN2017   e
09JAN2017   f
25DEC2016   g

我需要创建一个新表:

data new;
     set old;
     ****the right code***
run;

这样,new 包含具有最新日期的行.所以 new 是:

so that, new contains rows, which have the latest date. So new is:

23JAN2017   d

我首先尝试查找最新日期:latest_date = max(date),然后使用简单的where=,但正如您所见-这不起作用

I tried first finding the latest date: latest_date = max(date) , and then using a simple where=, but as you can see- this doesnt work

推荐答案

SQL 是最简单的方法.数据步骤中的 max 函数不能跨行工作(因为数据步骤中没有函数);它会给你最多的许多列.在数据步骤中,您还可以转置、使用散列或仅使用跨行保留保存的值并进行比较;但是SQL很简单,使用have.

SQL is the easiest way to do this. max function in the data step doesn't work across rows (as no function in the data step does); it would give you the maximum of many columns. In the data step you could also transpose, use a hash, or just retain the saved values using across rows and compare; but SQL is very simple, using having.

data have;
format date date9.;
input date :date9. code :$1.;
datalines;
05DEC2016   a
28NOV2016   b
22DEC2016   c
23JAN2017   d
12JAN2017   e
09JAN2017   f
25DEC2016   g
;;;;
run;

proc sql;
  select date, code
    from have
    having date=max(date);
quit;

要以最简单的方式在数据步骤中进行,但您必须使用 retain,大致如下:

To do in the data step in the most simple way though you have to use retain, something along these lines:

data want;
  set have end=eof;
  retain  max_Date max_code;
  if date > max_date then do;
    max_date = date;
    max_code = code;
  end;
  if eof then output;
  format max_date date9.;
  rename 
    max_date=date 
    max_code=code
  ;
  keep max_date max_code;
run;

最后,真正最简单的方法是使用 idgroup 选项在 proc 意味着.

Finally, the really easiest way to do this specific thing is to use the idgroup option in proc means.

proc means data=have;
  var date;
  output out=want 
         idgroup(max(date) /* asks for the row with the maximum date */
        out[1]             /* asks for the 1 maximum date - could get more */
        (date code)=       /* asks for these two variables to be output */
  );
run;

这篇关于选择在数据步上具有最大值的观测值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆