如何使用 proc compare 来更新数据集 [英] How to use proc compare to update dataset

查看:19
本文介绍了如何使用 proc compare 来更新数据集的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想使用 proc compare 来每天更新数据集.

I want to use proc compare to update dataset on a daily basis.

work.HAVE1

Date        Key Var1 Var2 
01Aug2013   K1   a    2
01Aug2013   K2   a    3
02Aug2013   K1   b    4

work.HAVE2

Date        Key Var1 Var2 
01Aug2013   K1   a    3
01Aug2013   K2   a    3
02Aug2013   K1   b    4
03Aug2013   K2   c    1

DateKey唯一确定一条记录.如何使用上面的两个表来构造以下

Date and Key are uniquely determine one record. How can I use the above two tables to construct the following

work.WANT

Date        Key Var1 Var2 
01Aug2013   K1   a    3
01Aug2013   K2   a    3
02Aug2013   K1   b    4
03Aug2013   K2   c    1

我不想删除以前的数据然后重新构建它.我想通过在底部追加新记录来修改它并调整VAR1VAR2中的值.我在 proc compare 上苦苦挣扎,但它并没有返回我想要的.

I don't want to delete the previous data and then rebuild it. I want to modify it via append new records at the bottom and adjust the values in VAR1 or VAR2. I'm struggling with proc compare but it just doesn't return what I want.

推荐答案

proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.DIFF outnoequal outcomp;
id Date Key;
run;

这将在单个数据集 WORK.DIFF 中为您提供新的和更改的(不相等的记录).你必须自己区分新的和改变的.

This will give you new and changed (unequal records) in single dataset WORK.DIFF. You'll have to distinguish new vs changed yourself.

但是,您想要实现的实际上是 MERGE - 插入新的,覆盖现有的,尽管可能是由于性能原因等.您不想重新创建完整的表.

However, what you want to achieve is actually a MERGE - inserts new, overwrites existing, though maybe due to performance reasons etc. you don't want to re-create the full table.

data work.WANT;
    merge work.HAVE1 work.HAVE2;
    by Date Key;
run;

/* outdiff option will produce records with _type_ = 'DIF' for matched keys */
proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.RESULT outnoequal outcomp outdiff;
id Date Key;
run;


data WORK.DIFF_KEYS;  /* keys of changed records */
    set WORK.RESULT;
    where _type_ = 'DIF';
    keep Date Key;
run;

/* split NEW and CHANGED */
data
    WORK.NEW
    WORK.CHANGED
;
 merge
      WORK.RESULT (where=( _type_ ne 'DIF'));
        WORK.DIFF_KEYS (in = d)
    ;
  by Date Key;
  if d then output WORK.CHANGED;
  else output WORK.NEW;
run;

现在您只需 APPEND WORK.NEW 到目标表.

Now you can just APPEND the WORK.NEW to target table.

对于 WORK.CHANGED - 使用 MODIFYUPDATE 语句来更新记录.根据更改的大小,您还可以考虑 PROC SQL;DELETE 删除旧记录,PROC APPEND 添加新值.

For WORK.CHANGED - either use MODIFY or UPDATE statement to update the records. Depending on the size of the changes, you can also think about PROC SQL; DELETE to delete old records and PROC APPEND to add new values.

这篇关于如何使用 proc compare 来更新数据集的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆