Read from a Hive table and write back to it using Spark SQL
Problem description
I am reading a Hive table using Spark SQL and assigning it to a Scala val:
val x = sqlContext.sql("select * from some_table")
Then I do some processing on the dataframe x and finally come up with a dataframe y, which has exactly the same schema as the table some_table.
Finally, I try to insert-overwrite the dataframe y into the same Hive table some_table:
y.write.mode(SaveMode.Overwrite).saveAsTable().insertInto("some_table")
Then I get the error:
org.apache.spark.sql.AnalysisException: Cannot insert overwrite into table that is also being read from
I tried building an insert SQL statement and firing it with sqlContext.sql(), but it gave me the same error.
Is there any way I can bypass this error? I need to insert the records back into the same table.
Edit: I tried doing as suggested, but I am still getting the same error:
val x = sqlContext.sql("select * from incremental.test2")
val y = x.limit(5)
y.registerTempTable("temp_table")
val dy = sqlContext.table("temp_table")
dy.write.mode("overwrite").insertInto("incremental.test2")
org.apache.spark.sql.AnalysisException: Cannot insert overwrite into table that is also being read from.;
Recommended answer
You should first save your DataFrame y in a temporary table. Note that registerTempTable only registers a lazy view whose plan still reads from the original table, whereas saveAsTable actually materializes the data, which breaks the read-from/write-to cycle on some_table:
y.write.mode("overwrite").saveAsTable("temp_table")
Then you can overwrite the rows in your target table:
val dy = sqlContext.table("temp_table")
dy.write.mode("overwrite").insertInto("some_table")
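Putting the answer together, a minimal end-to-end sketch (assuming a Spark 1.x HiveContext bound to sqlContext, as in the question; the intermediate table name temp_table and the final drop are illustrative choices, not part of the original answer):

```scala
import org.apache.spark.sql.SaveMode

// Read the source Hive table (sqlContext is a HiveContext, as in the question)
val x = sqlContext.sql("select * from some_table")

// ... transformations producing y with the same schema as some_table ...
val y = x.limit(5)

// Materialize y into a real intermediate table; saveAsTable writes the
// data out, so some_table is no longer both read from and written to
y.write.mode(SaveMode.Overwrite).saveAsTable("temp_table")

// Re-read the materialized copy and overwrite the original table
val dy = sqlContext.table("temp_table")
dy.write.mode(SaveMode.Overwrite).insertInto("some_table")

// Optionally clean up the intermediate table afterwards
sqlContext.sql("drop table if exists temp_table")
```

The key design point is that the intermediate table holds its own copy of the data on disk; a registered temp view would not, which is why the registerTempTable attempt in the question still fails.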