如何使用proc比较更新数据集 [英] How to use proc compare to update dataset

查看:157
本文介绍了如何使用proc比较更新数据集的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想使用 proc compare 每天更新数据集。

I want to use proc compare to update dataset on a daily basis.

work.HAVE1

Date        Key Var1 Var2 
01Aug2013   K1   a    2
01Aug2013   K2   a    3
02Aug2013   K1   b    4

工作。 HAVE2

Date        Key Var1 Var2 
01Aug2013   K1   a    3
01Aug2013   K2   a    3
02Aug2013   K1   b    4
03Aug2013   K2   c    1

Date 唯一确定一个记录。
如何使用上面的两个表来构造下面的

Date and Key are uniquely determine one record. How can I use the above two tables to construct the following

work.WANT

Date        Key Var1 Var2 
01Aug2013   K1   a    3
01Aug2013   K2   a    3
02Aug2013   K1   b    4
03Aug2013   K2   c    1

我不想删除以前的数据,然后重建。我想通过在底部附加新记录并调整 VAR1 中的值来修改 > VAR2
我在努力与 proc比较但它只是不返回我想要的。

I don't want to delete the previous data and then rebuild it. I want to modify it via append new records at the bottom and adjust the values in VAR1 or VAR2. I'm struggling with proc compare but it just doesn't return what I want.

推荐答案

proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.DIFF outnoequal outcomp;
id Date Key;
run;

这将在单个数据集WORK.DIFF中给出新的和更改的

This will give you new and changed (unequal records) in single dataset WORK.DIFF. You'll have to distinguish new vs changed yourself.

但是,你想要实现的是一个 MERGE - 插入新的,覆盖现有的,虽然也许是由于性能原因等等,你不想重新创建完整的表。

However, what you want to achieve is actually a MERGE - inserts new, overwrites existing, though maybe due to performance reasons etc. you don't want to re-create the full table.

data work.WANT;
    merge work.HAVE1 work.HAVE2;
    by Date Key;
run;

Edit1:

/* outdiff option will produce records with _type_ = 'DIF' for matched keys */
proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.RESULT outnoequal outcomp outdiff;
id Date Key;
run;


data WORK.DIFF_KEYS;  /* keys of changed records */
    set WORK.RESULT;
    where _type_ = 'DIF';
    keep Date Key;
run;

/* split NEW and CHANGED */
data
    WORK.NEW
    WORK.CHANGED
;
 merge
      WORK.RESULT (where=( _type_ ne 'DIF'));
        WORK.DIFF_KEYS (in = d)
    ;
  by Date Key;
  if d then output WORK.CHANGED;
  else output WORK.NEW;
run;

Edit2:

APPEND WORK.NEW到目标表。

Now you can just APPEND the WORK.NEW to target table.

对于WORK.CHANGED - 使用 MODIFY UPDATE 语句来更新记录。
根据更改的大小,您还可以考虑 PROC SQL; DELETE 可删除旧记录, PROC APPEND 可添加新值。

For WORK.CHANGED - either use MODIFY or UPDATE statement to update the records. Depending on the size of the changes, you can also think about PROC SQL; DELETE to delete old records and PROC APPEND to add new values.

这篇关于如何使用proc比较更新数据集的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆