SaveAsTable in Spark Scala: HDP3.x

Problem description

I have a DataFrame in Spark that I am trying to save to Hive as a table, but I get the error message below.

    java.lang.RuntimeException: com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector does not allow create table as select.
        at scala.sys.package$.error(package.scala:27)

Can anyone please help me save this as a table in Hive?

    import org.apache.spark.sql.functions.when

    // Join both DataFrames on the inv_num column, then derive the new salary
    val df3 = df1.join(df2, df1("inv_num") === df2("inv_num"))
      .withColumn("finalSalary",
        when(df1("salary") < df2("salary"), df2("salary") - df1("salary"))
          .otherwise(
            when(df1("salary") > df2("salary"), df1("salary") + df2("salary")) // e.g. 5000 + 3000 = 8000
              .otherwise(df2("salary")))) // otherwise take the salary from the second DataFrame
      .drop(df1("salary"))
      .drop(df2("salary"))
      .withColumnRenamed("finalSalary", "salary")
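As a reading aid, the nested `when`/`otherwise` rule above can be restated as an ordinary Scala function. This is only a sketch: the object and function names are made up for illustration, salaries are assumed to be whole numbers, and nulls (which Spark would send to the final `otherwise` branch) are ignored.

```scala
// Plain-Scala restatement of the finalSalary rule; no Spark required.
// SalaryRule and finalSalary are hypothetical names, not part of the question.
object SalaryRule {
  // s1 = salary from df1, s2 = salary from df2
  def finalSalary(s1: Long, s2: Long): Long =
    if (s1 < s2) s2 - s1      // first when(...) branch
    else if (s1 > s2) s1 + s2 // nested when(...) branch
    else s2                   // final otherwise(...): keep df2's salary

  def main(args: Array[String]): Unit = {
    println(finalSalary(3000L, 5000L)) // prints 2000
    println(finalSalary(5000L, 3000L)) // prints 8000
    println(finalSalary(4000L, 4000L)) // prints 4000
  }
}
```

Checking a few values this way makes it easier to confirm that the column expression encodes the intended rule before involving Hive at all.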

The code below does not work:

    df3.write
      .format("com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector")
      .option("database", "dbname")
      .option("table", "tablename")
      .mode("Append")
      .saveAsTable("tablename")

When I run it, it throws the following error:

    java.lang.RuntimeException: com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector does not allow create table as select.
        at scala.sys.package$.error(package.scala:27)

Note: The table already exists in the database, and I am using HDP 3.x.

Recommended answer

See if the solution below works for you:

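    import org.apache.spark.sql.functions.when

    // Data source name, identical to the class name used in the question
    val HIVE_WAREHOUSE_CONNECTOR = "com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector"

    // Join both DataFrames on the inv_num column, then derive the new salary
    val df3 = df1.join(df2, df1("inv_num") === df2("inv_num"))
      .withColumn("finalSalary",
        when(df1("salary") < df2("salary"), df2("salary") - df1("salary"))
          .otherwise(
            when(df1("salary") > df2("salary"), df1("salary") + df2("salary")) // e.g. 5000 + 3000 = 8000
              .otherwise(df2("salary")))) // otherwise take the salary from the second DataFrame
      .drop(df1("salary"))
      .drop(df2("salary"))
      .withColumnRenamed("finalSalary", "salary")

    val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()

    df3.createOrReplaceTempView("<temp-tbl-name>")
    hive.setDatabase("<db-name>")

    // Optional if the table already exists. The builder only issues the DDL
    // once .create() is called; "<col-type>" is a placeholder for the Hive type.
    hive.createTable("<tbl-name>")
      .ifNotExists()
      .column("salary", "<col-type>")
      .create()

    spark.sql("SELECT salary FROM <temp-tbl-name>")
      .write
      .format(HIVE_WAREHOUSE_CONNECTOR)
      .mode("append")
      .option("table", "<tbl-name>")
      .save() // save(), not saveAsTable(), which the connector rejects as create table as select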
