I receive the error "Cannot time travel Delta table to version X" whereas I can see version X when looking at the history on Azure Databricks
Question
I have a table in Delta Lake which has these tblproperties:
I'm trying to access a version which was there last month, version 322.
When I look at the history, I can see it:
But when I try to access it with a command like this:
spark.read.format("delta").option("versionAsOf", 322).load(path)
I receive this error:
AnalysisException: Cannot time travel Delta table to version 322. Available versions: [330, 341].;
I don't understand the problem. I'm using Azure Databricks.
Recommended answer
I'm not sure I understand this bug. There's an open pull request in Delta Lake that might solve the problem: https://github.com/delta-io/delta/pull/627.
Until then, someone from Databricks gave me a workaround: set delta.checkpointRetentionDuration to X days. That will keep your checkpoints around long enough to give you access to the older versions.
Then, you have to run something like this on your Delta table:
spark.sql(f"""
    ALTER TABLE delta.`path`
    SET TBLPROPERTIES (
        delta.logRetentionDuration = 'interval X days',
        delta.deletedFileRetentionDuration = 'interval X days',
        delta.checkpointRetentionDuration = 'X days'
    )
""")
This keeps the versions available for X days.
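To see why version 322 is gone while 330 and 341 remain, here is a toy model (not Delta Lake's actual internals): a version can only be time-traveled to while its log/checkpoint files are still inside the retention window, so extending the retention duration widens the set of reachable versions. The `accessible_versions` helper and the timestamps below are illustrative assumptions, not taken from the question.

```python
from datetime import datetime, timedelta

def accessible_versions(version_times, retention_days, now):
    """Toy model: a version stays reachable only while its log/checkpoint
    files fall within the retention window ending at `now`."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(v for v, t in version_times.items() if t >= cutoff)

now = datetime(2021, 6, 1)
history = {  # hypothetical commit timestamps for the versions in the question
    322: datetime(2021, 4, 20),  # roughly six weeks old
    330: datetime(2021, 5, 10),
    341: datetime(2021, 5, 28),
}

print(accessible_versions(history, 30, now))  # 30-day window: [330, 341]
print(accessible_versions(history, 60, now))  # 60-day window: [322, 330, 341]
```

With a 30-day window only 330 and 341 survive, matching the "Available versions: [330, 341]" in the error; widening retention to 60 days brings 322 back into reach, which is the effect the ALTER TABLE above aims for.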