如何使用 proc compare 来更新数据集 [英] How to use proc compare to update dataset
问题描述
我想使用 proc compare
来每天更新数据集.
I want to use proc compare
to update dataset on a daily basis.
work.HAVE1
Date Key Var1 Var2
01Aug2013 K1 a 2
01Aug2013 K2 a 3
02Aug2013 K1 b 4
work.HAVE2
Date Key Var1 Var2
01Aug2013 K1 a 3
01Aug2013 K2 a 3
02Aug2013 K1 b 4
03Aug2013 K2 c 1
Date
和Key
唯一确定一条记录.如何使用上面的两个表来构造以下
Date
and Key
are uniquely determine one record.
How can I use the above two tables to construct the following
work.WANT
Date Key Var1 Var2
01Aug2013 K1 a 3
01Aug2013 K2 a 3
02Aug2013 K1 b 4
03Aug2013 K2 c 1
我不想删除以前的数据然后重新构建它.我想通过在底部追加新记录来修改
它并调整VAR1
或VAR2
中的值.我在 proc compare
上苦苦挣扎,但它并没有返回我想要的.
I don't want to delete the previous data and then rebuild it. I want to modify
it via append new records at the bottom and adjust the values in VAR1
or VAR2
.
I'm struggling with proc compare
but it just doesn't return what I want.
推荐答案
proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.DIFF outnoequal outcomp;
id Date Key;
run;
这将在单个数据集 WORK.DIFF 中为您提供新的和更改的(不相等的记录).你必须自己区分新的和改变的.
This will give you new and changed (unequal records) in single dataset WORK.DIFF. You'll have to distinguish new vs changed yourself.
但是,您想要实现的实际上是 MERGE
- 插入新的,覆盖现有的,尽管可能是由于性能原因等.您不想重新创建完整的表.
However, what you want to achieve is actually a MERGE
- inserts new, overwrites existing, though maybe due to performance reasons etc. you don't want to re-create the full table.
data work.WANT;
merge work.HAVE1 work.HAVE2;
by Date Key;
run;
/* outdiff option will produce records with _type_ = 'DIF' for matched keys */
proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.RESULT outnoequal outcomp outdiff;
id Date Key;
run;
data WORK.DIFF_KEYS; /* keys of changed records */
set WORK.RESULT;
where _type_ = 'DIF';
keep Date Key;
run;
/* split NEW and CHANGED */
data
WORK.NEW
WORK.CHANGED
;
merge
WORK.RESULT (where=( _type_ ne 'DIF'));
WORK.DIFF_KEYS (in = d)
;
by Date Key;
if d then output WORK.CHANGED;
else output WORK.NEW;
run;
现在您只需 APPEND
WORK.NEW 到目标表.
Now you can just APPEND
the WORK.NEW to target table.
对于 WORK.CHANGED - 使用 MODIFY
或 UPDATE
语句来更新记录.根据更改的大小,您还可以考虑 PROC SQL;DELETE
删除旧记录,PROC APPEND
添加新值.
For WORK.CHANGED - either use MODIFY
or UPDATE
statement to update the records.
Depending on the size of the changes, you can also think about PROC SQL; DELETE
to delete old records and PROC APPEND
to add new values.
这篇关于如何使用 proc compare 来更新数据集的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!