I receive the error "Cannot time travel Delta table to version X" whereas I can see version X when looking at the history on Azure Databricks


Problem description

I have a table in Delta Lake which has these tblproperties:

I'm trying to access a version that was there last month, version 322.

When I look at the history, I can see it:

But when I try to access it with a command like this:

spark.read.format("delta").option("versionAsOf", 322).load(path)

I get this error:

AnalysisException: Cannot time travel Delta table to version 322. Available versions: [330, 341].;

I can't understand the problem. I'm using Azure Databricks.
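The error message itself hints at what happened: Delta can only reconstruct a version for which the transaction log, and a usable checkpoint at or below that version, still exist; once older checkpoints and log files have been cleaned up, versions below the oldest surviving checkpoint are gone. A purely illustrative sketch of that availability check (this is not Delta Lake's actual implementation; the function and version numbers are hypothetical):

```python
# Illustrative sketch only -- not Delta Lake's real reconstruction logic.
# Delta rebuilds a table state from the newest checkpoint at or below the
# requested version plus the JSON commits after it; versions older than
# the oldest surviving checkpoint can no longer be reconstructed.

def check_time_travel(requested: int, available: list[int]) -> None:
    """Raise if `requested` falls outside the surviving version range."""
    lo, hi = min(available), max(available)
    if not (lo <= requested <= hi):
        raise ValueError(
            f"Cannot time travel Delta table to version {requested}. "
            f"Available versions: [{lo}, {hi}]."
        )

check_time_travel(335, [330, 341])   # fine: within the surviving range
try:
    check_time_travel(322, [330, 341])
except ValueError as e:
    print(e)  # Cannot time travel Delta table to version 322. ...
```

Under that reading, version 322 fails because its supporting log/checkpoint files were cleaned up, while 330-341 are still reconstructible.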

Answer

I'm not sure I understand this bug. There's an open pull request in Delta Lake that might solve the problem: https://github.com/delta-io/delta/pull/627.

Until then, a person from Databricks gave me a workaround: set delta.checkpointRetentionDuration to X days. That will keep your checkpoints long enough to have access to older versions.

Then, you must run something like this on your Delta table:

spark.sql(
    f"""
    ALTER TABLE delta.`{path}`
        SET TBLPROPERTIES (
            delta.logRetentionDuration = 'interval X days',
            delta.deletedFileRetentionDuration = 'interval X days',
            delta.checkpointRetentionDuration = 'X days'
        )
    """
)

It will keep the versions for X days.
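The ALTER TABLE statement above can also be built as a plain string before handing it to spark.sql, which makes the retention window easy to parameterize. A small sketch, assuming a hypothetical helper name `retention_sql` and an example path; nothing here requires Spark itself:

```python
# Hypothetical helper: builds the ALTER TABLE statement shown above for a
# given table path and retention window in days. Pass the result to
# spark.sql(...) on a cluster; the path below is just an example.

def retention_sql(table_path: str, days: int) -> str:
    return (
        f"ALTER TABLE delta.`{table_path}` SET TBLPROPERTIES ("
        f"delta.logRetentionDuration = 'interval {days} days', "
        f"delta.deletedFileRetentionDuration = 'interval {days} days', "
        f"delta.checkpointRetentionDuration = '{days} days')"
    )

print(retention_sql("/mnt/tables/my_table", 90))
```

On a cluster you would then run, for example, `spark.sql(retention_sql(path, 90))`.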

