Dataframe to automatically create Impala table when writing to Impala


Problem description

I would like to know whether Spark's Dataframe-saving functionality can, when writing data to an Impala table, also create that table if it was not previously created in Impala.

For example, the code:

myDataframe.write.mode(SaveMode.Overwrite).jdbc(jdbcURL, "books", connectionProperties)

should create the table if it doesn't exist.

The table schema should be determined from the dataframe schema.
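As an illustration of what "determined from the dataframe schema" would involve, here is a hypothetical helper (not part of Spark's API) that maps a list of field names and Spark type names to an Impala `CREATE TABLE` statement. The type mapping shown is a minimal assumption and would need to be extended for real schemas.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class DdlBuilder {
    // Minimal Spark-type -> Impala-type mapping (assumption; extend as needed).
    private static final Map<String, String> TYPE_MAP = new LinkedHashMap<>();
    static {
        TYPE_MAP.put("StringType", "STRING");
        TYPE_MAP.put("IntegerType", "INT");
        TYPE_MAP.put("LongType", "BIGINT");
        TYPE_MAP.put("DoubleType", "DOUBLE");
        TYPE_MAP.put("TimestampType", "TIMESTAMP");
    }

    // Build a CREATE TABLE IF NOT EXISTS statement from (column -> Spark type) pairs.
    public static String buildDdl(String table, LinkedHashMap<String, String> fields) {
        StringJoiner cols = new StringJoiner(", ");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            cols.add(f.getKey() + " " + TYPE_MAP.getOrDefault(f.getValue(), "STRING"));
        }
        return "CREATE TABLE IF NOT EXISTS " + table + " (" + cols + ")";
    }
}
```

For a `books` dataframe with a `title` string column and a `year` integer column, this would produce `CREATE TABLE IF NOT EXISTS books (title STRING, year INT)`. Spark's JDBC writer performs a similar schema-to-DDL translation internally when it creates the target table.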

Looking forward to your suggestions/ideas.

Regards, Florin

Recommended answer

In the past I created the table via a mutateStatement.execute with the relevant DDL. I checked with Spark 2.x, and append creates it automatically as well, so append is all you need.
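The DDL-first approach the answer mentions can be sketched with a plain JDBC `Statement`, run once before Spark appends. This is a minimal sketch under assumptions: the JDBC URL, table name, and DDL text are placeholders, and an Impala JDBC driver is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class EnsureTable {
    // Execute the given DDL (e.g. CREATE TABLE IF NOT EXISTS ...) before
    // letting Spark append to the table.
    public static void ensureTableExists(String jdbcUrl, String ddl) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement()) {
            stmt.execute(ddl); // no-op if the table already exists
        }
    }
}
```

Using `CREATE TABLE IF NOT EXISTS` makes the call idempotent, so it is safe to run on every job start.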

For JDBC:

jdbcDF.write.mode("append").jdbc(url, table, prop) 

For Hive via Spark 2.x auto Hive context:

x.write.mode("append").saveAsTable("a_hive_table_xx") 
