如何使用proc比较更新数据集 [英] How to use proc compare to update dataset
问题描述
我想使用 proc compare
每天更新数据集。
I want to use proc compare
to update dataset on a daily basis.
work.HAVE1
Date Key Var1 Var2
01Aug2013 K1 a 2
01Aug2013 K2 a 3
02Aug2013 K1 b 4
工作。 HAVE2
Date Key Var1 Var2
01Aug2013 K1 a 3
01Aug2013 K2 a 3
02Aug2013 K1 b 4
03Aug2013 K2 c 1
Date
和键
唯一确定一个记录。
如何使用上面的两个表来构造下面的
Date
and Key
are uniquely determine one record.
How can I use the above two tables to construct the following
work.WANT
Date Key Var1 Var2
01Aug2013 K1 a 3
01Aug2013 K2 a 3
02Aug2013 K1 b 4
03Aug2013 K2 c 1
我不想删除以前的数据,然后重建。我想通过在底部附加新记录并调整 VAR1
或中的值来
。 修改
> VAR2
我在努力与 proc比较
但它只是不返回我想要的。
I don't want to delete the previous data and then rebuild it. I want to modify
it via append new records at the bottom and adjust the values in VAR1
or VAR2
.
I'm struggling with proc compare
but it just doesn't return what I want.
推荐答案
proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.DIFF outnoequal outcomp;
id Date Key;
run;
这将在单个数据集WORK.DIFF中给出新的和更改的
This will give you new and changed (unequal records) in single dataset WORK.DIFF. You'll have to distinguish new vs changed yourself.
但是,你想要实现的是一个 MERGE
- 插入新的,覆盖现有的,虽然也许是由于性能原因等等,你不想重新创建完整的表。
However, what you want to achieve is actually a MERGE
- inserts new, overwrites existing, though maybe due to performance reasons etc. you don't want to re-create the full table.
data work.WANT;
merge work.HAVE1 work.HAVE2;
by Date Key;
run;
Edit1:
/* outdiff option will produce records with _type_ = 'DIF' for matched keys */
proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.RESULT outnoequal outcomp outdiff;
id Date Key;
run;
data WORK.DIFF_KEYS; /* keys of changed records */
set WORK.RESULT;
where _type_ = 'DIF';
keep Date Key;
run;
/* split NEW and CHANGED */
data
WORK.NEW
WORK.CHANGED
;
merge
WORK.RESULT (where=( _type_ ne 'DIF'));
WORK.DIFF_KEYS (in = d)
;
by Date Key;
if d then output WORK.CHANGED;
else output WORK.NEW;
run;
Edit2:
APPEND
WORK.NEW到目标表。
Now you can just APPEND
the WORK.NEW to target table.
对于WORK.CHANGED - 使用 MODIFY
或 UPDATE
语句来更新记录。
根据更改的大小,您还可以考虑 PROC SQL; DELETE
可删除旧记录, PROC APPEND
可添加新值。
For WORK.CHANGED - either use MODIFY
or UPDATE
statement to update the records.
Depending on the size of the changes, you can also think about PROC SQL; DELETE
to delete old records and PROC APPEND
to add new values.
这篇关于如何使用proc比较更新数据集的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!