Delta Lake MERGE / UPDATE rewriting data even when condition is not met

Question

I'm trying to reduce unnecessary data writes and only write to the Delta Lake table under a specific condition. Why do these statements always rewrite the data?

%sql
MERGE INTO tblTest as target
USING temp_Source as source
ON target.ID = source.ID
WHEN MATCHED AND 1 = 0
THEN UPDATE SET *

Or this:

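from pyspark.sql.functions import expr

# deltaTable is an existing DeltaTable; dfSource is the source DataFrame
deltaTable.alias("target").merge(
  source = dfSource.alias("source"),
  condition = expr("source.ID = target.ID")) \
.whenMatchedUpdateAll('1 = 0') \
.execute()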

I expected that only the table metadata would be updated and that no data from the source would be written to the target.

Recommended answer

This is known Delta Lake behavior: it rewrites every file that has a record matching the ON clause, regardless of the conditions on WHEN MATCHED / WHEN NOT MATCHED. If you want to avoid this, move your condition into the ON clause.
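As a rough illustration of moving the condition into the ON clause, here is a minimal sketch based on the Python snippet from the question. The helper name `merge_with_condition_in_on` and its `extra_condition` parameter are hypothetical, introduced only for this example; `delta_table` is assumed to be a `delta.tables.DeltaTable` and `df_source` a Spark DataFrame with an `ID` column, as in the question.

```python
def merge_with_condition_in_on(delta_table, df_source, extra_condition):
    """Run a MERGE whose extra predicate is part of the ON clause.

    Hypothetical helper: delta_table is assumed to be a DeltaTable,
    df_source a Spark DataFrame, extra_condition a SQL predicate string.
    """
    # Fold the predicate into the join condition so that rows failing it
    # are not counted as matches in the first place.
    on_clause = f"source.ID = target.ID AND ({extra_condition})"
    (
        delta_table.alias("target")
        .merge(source=df_source.alias("source"), condition=on_clause)
        .whenMatchedUpdateAll()  # no per-clause condition needed any more
        .execute()
    )
    return on_clause


# Usage with the names from the question (not executed here):
# merge_with_condition_in_on(deltaTable, dfSource, "1 = 0")
```

One caveat with this approach: if the MERGE also has a WHEN NOT MATCHED clause, source rows that fail the extra predicate now count as not matched and would be inserted, so this rewrite is only a drop-in replacement for update-only merges like the one in the question.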
