Tables not found in Spark SQL after migrating from EMR to AWS Glue
Question
I have Spark jobs on EMR, and EMR is configured to use the Glue Data Catalog for Hive and Spark metadata.
I create Hive external tables, they appear in the Glue catalog, and my Spark jobs can reference them in Spark SQL, e.g. spark.sql("select * from hive_table ...").
Now, when I try to run the same code in a Glue job, it fails with a "table not found" error. It looks like Glue jobs do not use the Glue catalog for Spark SQL the same way Spark SQL does when running on EMR.
I can work around this by using the Glue APIs and registering dataframes as temp views:
create_dynamic_frame_from_catalog(...).toDF().createOrReplaceTempView(...)
but is there a way to do this automatically?
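The workaround above can be sketched as a small helper for a Glue job script. The database name "mydb" and the table list are hypothetical placeholders; inside an actual Glue job, glue_context would be the job's GlueContext instance:

```python
# Sketch of the temp-view workaround, assuming a hypothetical Glue database
# "mydb" and a list of catalog table names to expose to Spark SQL.

def catalog_pairs(database, table_names):
    """Pure helper: pair each table name with its database."""
    return [(database, name) for name in table_names]

def register_temp_views(glue_context, database, table_names):
    """Load each catalog table via the Glue API and register it as a temp view.

    Only runnable inside the Glue job runtime, where awsglue is available.
    """
    for db, name in catalog_pairs(database, table_names):
        df = glue_context.create_dynamic_frame_from_catalog(
            database=db, table_name=name
        ).toDF()
        # After this, spark.sql("select * from <name> ...") resolves the table.
        df.createOrReplaceTempView(name)
```

A job would call register_temp_views(glueContext, "mydb", ["hive_table"]) once at startup, before any spark.sql() statements.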
Answer
This was a long-awaited feature request (using the Glue Data Catalog with Glue ETL jobs), and it has recently been released. When you create a new job, you'll find the following option:
Use Glue data catalog as the Hive metastore
You can also enable it for an existing job by editing the job and adding --enable-glue-datacatalog to the job parameters, providing no value.
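As a hedged sketch, the same flag can also be supplied when starting a job run from the AWS CLI (my-glue-job is a hypothetical job name; the flag takes no value, so an empty string is passed):

```shell
# Pass the special parameter per run; it takes no value (empty string).
aws glue start-job-run \
  --job-name my-glue-job \
  --arguments '{"--enable-glue-datacatalog": ""}'
```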