Delta Lake MERGE / UPDATE rewriting data even when condition is not met
Problem description
I'm trying to reduce unnecessary data writes and only write to the Delta Lake table under a specific condition. Why do these statements always rewrite the data?
%sql
MERGE INTO tblTest as target
USING temp_Source as source
ON target.ID = source.ID
WHEN MATCHED AND 1 = 0
THEN UPDATE SET *
or this:
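from pyspark.sql.functions import expr

# deltaTable (a DeltaTable) and dfSource (a DataFrame) are defined elsewhere
deltaTable.alias("target").merge(
    source = dfSource.alias("source"),
    condition = expr("source.ID = target.ID")) \
  .whenMatchedUpdateAll('1 = 0') \
  .execute()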
I'm expecting that only the table metadata would be updated and that no data from the source would be written to the target.
Recommended answer
That's a known behavior of Delta: it rewrites every file that has a matching record in the ON clause, regardless of the conditions on WHEN MATCHED / WHEN NOT MATCHED. If you want to avoid this, move your condition into the ON clause.
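As a rough illustration of moving the condition into the ON clause, the sketch below folds the update predicate into the merge condition and leaves whenMatchedUpdateAll without a condition of its own. The helper names build_on_clause and merge_changed_rows are hypothetical, the table and column names are taken from the question, and the code assumes the delta-spark package is installed. It is one possible way to apply the advice, not a definitive implementation.

```python
def build_on_clause(key_match: str, update_predicate: str) -> str:
    """Fold the update predicate into the join (ON) condition.

    Hypothetical helper: both arguments are SQL expression strings.
    """
    return f"({key_match}) AND ({update_predicate})"


def merge_changed_rows(spark, source_df, target_name: str, update_predicate: str) -> None:
    """Run the merge with the predicate in ON instead of WHEN MATCHED.

    Hypothetical wrapper around the Delta Lake Python API.
    """
    # Imported lazily so build_on_clause can be used without delta-spark installed.
    from delta.tables import DeltaTable

    target = DeltaTable.forName(spark, target_name)
    (
        target.alias("target")
        .merge(
            source=source_df.alias("source"),
            condition=build_on_clause("source.ID = target.ID", update_predicate),
        )
        # No extra condition here: it already lives in the ON clause.
        .whenMatchedUpdateAll()
        .execute()
    )


# The condition from the question, now part of the ON clause.
on_clause = build_on_clause("source.ID = target.ID", "1 = 0")
```

In a notebook this would be called as merge_changed_rows(spark, dfSource, "tblTest", "1 = 0"). Note that this is only equivalent to the original statement when the merge has no WHEN NOT MATCHED clause: with one, rows that match on ID but fail the predicate would be treated as not matched and could be inserted as duplicates.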